Data Science on the Google Cloud Platform: Implementing End-to-End Real-Time Data Pipelines: From Ingest to Machine Learning
H**K
A very good book on data science, that also covers Google Cloud Platform
I knew this book was for me just a few pages into the first chapter. This book by Lak is unlike many other books on data science and a particular technology, which just enumerate the how-tos of that technology. Lak starts with a concrete user problem strongly anchored in probabilistic outcomes, and then steps through a typical data science process of discovery, refinement, and then conversion to a production pipeline. While teaching about GCP technologies along the way, the book stays strongly anchored in the original user problem. There is not a corner of GCP needed for a full production data science product that goes untouched in this book. The material is well covered, with pointers to deeper material and user manuals. I received the first edition. As GCP technology evolved, Lak posted updates to his blog on Medium so that everyone could understand the updates to GCP and how to use them. I was pleasantly surprised to get these updates, and they made having the book that much more valuable.
J**E
If all you care about is the answer and not why, buy another book
While Lak’s conversational style can be a turn-off to some who just want an answer and don’t care about how, I liked this book. Many times with books like this you get an answer or a recipe and you’re done. What happens when your answer or recipe isn’t right for the situation? I’m glad Lak explains his rationale and lets it be known that there’s more than one way to do it. Could the book have been condensed without the explanations? Yes. Would it have been like almost every other book in the space? Yes. Check out this book if you want a well-thought-out answer and maybe alternatives. If you just want the “right answer”, then buy something else.
D**S
Data analysis and engineering is democratized for all
Wow. A true tour of data science and engineering on the cloud. It's been a few years since I've worked with tools in this field, but this book gave a clear, level-headed view for data engineers looking to derive and drive insights from data. Using a core example use case and following it end to end through the entire book (and indeed through cloud tools integrated with each other) helped me keep track of what was going on, and kept things from becoming a book on theory rather than one of accomplishment and answers. The purpose and process for each tool was clear, and I also appreciated the explanations of trade-offs and the value added for the choices made. The practice of data science is a LOT easier now with cloud/serverless tools than eight or nine years ago, and I feel this brought me back to the state of the art.
D**L
Needs updating
I do not understand the high reviews for this book, especially ones written in 2020. I'm only into chapter 2 and the code to download the files fails. There is a supplement on the GitHub page that allowed me to copy the bucket. But the explanation, like many things, is vague and not accurate (you don't provide the path to your bucket, but just the name of the bucket). I assumed this book was an introduction to using the Google Cloud Platform for data science, so I was expecting an introduction. This book has detail where it doesn't need it, and lacks detail where it does. It just assumes you have already been using GCP, but if that were the case this book wouldn't really be needed.
Major problems:
1. Code is not working.
2. Code is not explained in any detail.
3. Vague details about how to navigate GCP (chapter one has you create a bucket, but doesn't explain what a bucket is or how to create it, yet there are three pages about the definition of a data engineer).
4. Inconsistent assumptions about your background knowledge.
Good parts:
1. The use of a case study for learning.
B**K
Great resource for data scientists beginning on GCP
The book is easy to follow with detailed descriptions of each step followed to build a project from start to end on the Google Cloud Platform. The book is also accompanied by a code repository which lets the readers try out the project themselves. Strongly recommended for data scientists learning to use the platform.
V**Y
Book covers exactly what the title says.
Excellent book for learning which GCP services can be used for which portions of data analytics pipelines, from data acquisition all the way to model revalidation.
L**R
Narrative structure, working code
Narrative structure in a technical book is hard to find, and this was executed masterfully, with lots of code examples for you to follow along with on your own. Highly recommended.
J**E
Not a reference book, if you don't work through the first 99 pages you won't understand the 100th
This product is more akin to a course than a reference book. I tried flipping over to the chapter on Cloud SQL (actually the author only goes into BigQuery, so I ended up scrolling through Stack Overflow anyway). When I finally found the relevant chapter, it was impossible to disentangle the SQL code from the class objects built in the preceding six chapters. Do not buy this book if you have any intention other than reading every single page in order. Otherwise, you'll end up doing what I did, which was reading Stack Overflow and Medium articles to mixed effect.
K**8
For anyone who wants to get started with or learn about the topic on GCP
Anyone who works in the data field will potentially use Google Cloud. This book gives you a good foundation for that.
A**R
Great
Great
宗**き
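Well suited to learning the essentials from start to finish
A good product. It is in English, but it is written so that you can learn the essentials from start to finish. I am someone who is planning to do data science on Google Cloud, and it was a good introductory book for me.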
C**H
Get to the heart of data science straight away 😊👍
It sets a very clear direction for aspiring data engineers and scientists, as well as what is expected of them.
L**R
Not about GCP but Python and Shell Scripts
This book really sucks. Instead of explaining the Machine Learning (ML) toolbox on the Google Cloud Platform (GCP), the author gets excited about the shell and Python scripts that he uses. The only use case discussed (predicting the probability of a flight being late) is very special and has nothing to do with common use cases in ML. But worst of all, the wonderful GCP ML tools are not discussed at all, only some Big Data tools (Cloud SQL, Pub/Sub, etc.) in a very superficial manner. A complete waste of money!