Learning Spark: Lightning-Fast Data Analytics

Learning Spark: Lightning-Fast Data Analytics Paperback - 2020

by Jules S. Damji, Brooke Wenig, Tathagata Das, and Denny Lee

New

Description

O'Reilly Media. New. BRAND NEW, GIFT QUALITY! NOT OVERSTOCKS OR MARKED UP REMAINDERS! DIRECT FROM THE PUBLISHER!

A$86.55

A$5.82 Delivery within USA
Standard delivery: 5 to 11 days

Ships from Ambis Enterprises LLC (Michigan, United States)

Details

Title Learning Spark: Lightning-Fast Data Analytics
Author Jules S. Damji, Brooke Wenig, Tathagata Das, Denny Lee
Binding Paperback
Condition New
Pages 397
Volumes 1
Language ENG
Publisher O'Reilly Media
Publication date 2020-08-25
Illustrated Yes
Features Illustrated, Index
Bookseller's Inventory # OTF-S-9781492050049
ISBN 9781492050049 / 1492050040
Weight 1.4 lbs (0.64 kg)
Dimensions 9.2 x 7 x 0.9 in (23.37 x 17.78 x 2.29 cm)
Category Computers - General Information
Library of Congress subjects Machine learning, Data mining - Computer programs
Dewey Decimal Code 006.312
Quantity available 120

About Ambis Enterprises LLC Michigan, United States

Specialising in: New Books, Used Books

Biblio member since 2009

We love books, and love our customers. We underrate our book conditions to ensure you're happy, and handpack our shipments with pride!

Terms of Sale:

30-day return guarantee, with a full refund including shipping costs for up to 30 days after delivery if an item arrives damaged. Please contact us at Admin@lakesidebooks.com

From the publisher

Data is bigger, arrives faster, and comes in a variety of formats, and it all needs to be processed at scale for analytics or machine learning. But how can you process such varied workloads efficiently? Enter Apache Spark.

Updated to include Spark 3.0, this second edition shows data engineers and data scientists why structure and unification in Spark matter. Specifically, this book explains how to perform simple and complex data analytics and employ machine learning algorithms. Through step-by-step walk-throughs, code snippets, and notebooks, you'll be able to:

Learn Python, SQL, Scala, or Java high-level Structured APIs
Understand Spark operations and SQL Engine
Inspect, tune, and debug Spark operations with Spark configurations and Spark UI
Connect to data sources: JSON, Parquet, CSV, Avro, ORC, Hive, S3, or Kafka
Perform analytics on batch and streaming data using Structured Streaming
Build reliable data pipelines with open source Delta Lake and Spark
Develop machine learning pipelines with MLlib and productionize models using MLflow

About the author

Jules S. Damji is a senior developer advocate at Databricks and an MLflow contributor. He is a hands-on developer with over 20 years of experience and has worked as a software engineer at leading companies such as Sun Microsystems, Netscape, @Home, Loudcloud/Opsware, Verisign, ProQuest, and Hortonworks, building large-scale distributed systems. He holds a B.Sc. and an M.Sc. in computer science and an MA in political advocacy and communication from Oregon State University, Cal State, and Johns Hopkins University, respectively.

Brooke Wenig is a machine learning practice lead at Databricks. She leads a team of data scientists who develop large-scale machine learning pipelines for customers, and she teaches courses on distributed machine learning best practices. Previously, she was a principal data science consultant at Databricks. She holds an M.S. in computer science from UCLA with a focus on distributed machine learning.

Tathagata Das is a staff software engineer at Databricks, an Apache Spark committer, and a member of the Apache Spark Project Management Committee (PMC). He is one of the original developers of Apache Spark, the lead developer of Spark Streaming (DStreams), and is currently one of the core developers of Structured Streaming and Delta Lake. Tathagata holds an M.S. in computer science from UC Berkeley.

Denny Lee is a staff developer advocate at Databricks who has been working with Apache Spark since 0.6. He is a hands-on distributed systems and data sciences engineer with extensive experience developing internet-scale infrastructure, data platforms, and predictive analytics systems for both on-premises and cloud environments. He also has an M.S. in biomedical informatics from Oregon Health & Science University and has architected and implemented powerful data solutions for enterprise healthcare customers.
