Roaring Elephant

Author: Various
Narrator: Various
Publisher: Podcast
Duration: 300:03:29

More information

Synopsis

Bite-Sized Big Data

Episodes

Episode 93 – Apache Kylin: Extreme OLAP Engine for Big Data

19/06/2018 Duration: 46min

In this episode, Apache PMC member Dong Li joins us to explain how Apache Kylin can deploy analytical OLAP cubes in your Big Data environment. http://kylin.apache.org/ Dong Li Technical Partner & Senior Architect of Kyligence (linkedin) PMC Member of Apache Kylin http://en.kyligence.io/ Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Episode 92 – Roaring news

12/06/2018 Duration: 46min

Another week, another edition of Roaring Big Data News. This time, Dave talks about driving teens and Jhon takes a detailed look at an Eventbrite data pipeline article. Breaking News Dave Driver monitoring isn't just for teens; adults can benefit, too https://arstechnica.com/cars/2018/05/buicks-smart-driver-explains-why-my-gas-mileage-sucks-and-my-editors-doesnt/ Jhon Looking under the hood of the Eventbrite data pipeline! https://www.eventbrite.com/engineering/looking-under-the-hood-of-the-eventbrite-data-pipeline/ Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Episode 91 – ODPi is back and better than ever!

05/06/2018 Duration: 01h08min

In this episode, we welcome back John Mertic, Director of Program Management for ODPi, R Consortium, and the Open Mainframe Project. It's been almost two years since we checked in with John and the ODPi initiative and, as John mentions in the interview, a lot has changed in Hadoop... John Mertic Director of Program Management for ODPi, R Consortium, and Open Mainframe Project https://www.linkedin.com/in/jmertic/ ODPi website links: https://www.odpi.org/ https://www.odpi.org/blog/2018/04/04/the-state-of-open-source-and-big-data-three-years-later https://www.odpi.org/projects/data-governance-pmc https://www.odpi.org/events Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Episode 90 – Roaring news

29/05/2018 Duration: 38min

In this week's Roaring News episode, Dave brings up the resilience of Apache Community open source projects and plays some Doom. Jhon has some practical Apache NiFi guides and covers the emergence of multi-model NoSQL databases. Breaking News DataWorks Summit Berlin video recordings are up: https://www.youtube.com/user/HadoopSummit/playlists Find Dave on his Australian road-trip: http://bit.ly/aus-nz-ibm-hwx-tour Dave DataTorrent, Stream Processing Startup, Folds (Apache Apex) https://www.datanami.com/2018/05/08/datatorrent-stream-processing-startup-folds/ DOOM! https://arxiv.org/abs/1804.09154 https://www.technologyreview.com/s/611072/ai-generates-new-doom-levels-for-humans-to-play/ https://www.youtube.com/watch?v=K32FZ-tjQP4 Bonus doom news: https://www.rockpapershotgun.com/2018/03/28/dodge-fireballs-forever-in-a-neural-nets-doom-nightmare/ https://worldmodels.github.io/ Jhon Accessing Feeds from EtherDelta on Trades, Funds, Buys and Sells (Apache NiFi) https

Episode 89 – DataWorks Summit San Jose Agenda Review

22/05/2018 Duration: 01h12min

With the San Jose edition of the DataWorks Summit only a month away, we go over the sessions that are available in the agenda today and offer our top picks. If you're going, or if you will be watching the replays online, we hope to guide you on your selection of sessions. DataWorks Summit San Jose 2018 And here is the dashboard we created with statistics on the San Jose sessions, for your enjoyment: https://aka.ms/DWS2018SJ The agenda is still in flux so we will be updating the dashboard regularly. Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Episode 88 – Roaring News

15/05/2018 Duration: 35min

Returning to our more regular schedule, we have a Roaring News episode today. Dave has articles on multi-cloud readiness, Big Data being a pariah, and Google Duplex, while Jhon came up with synthetic data, data engineers versus data scientists, and a neural network sharing cake recipes. Breaking News Dave Less than 10% ready for multi cloud http://www.cloudpro.co.uk/cloud-essentials/hybrid-cloud/7451/idc-less-than-10-of-organisations-are-ready-for-multi-cloud Tech companies distancing themselves from Big Data https://qz.com/1262102/tech-companies-are-distancing-themselves-from-big-data/ Google Duplex https://ai.googleblog.com/2018/05/duplex-ai-system-for-natural-conversation.html Jhon The Rise of Synthetic Data to Help Developers Create and Train AI Algorithms Quickly and Affordably https://insidebigdata.com/2018/05/08/rise-synthetic-data-help-developers-create-train-ai-algorithms-quickly-affordably/ Data engineers vs. data scientists https://www.oreilly.com/ideas/data-enginee

Episode 87 – Druid: a high-performance, column-oriented, distributed data store – part 2

08/05/2018 Duration: 31min

This is the second part of an interview with Fangjin Yang, co-founder and CEO at Imply and committer/PMC member for the Druid project. Druid is a high-performance, column-oriented, distributed data store which has entered the Hadoop environment with the recent integration with Apache. Since Druid has been around for a while, we are grateful to FJ for spending some time with our listeners. Fangjin Yang Cofounder and CEO at Imply (linkedin) Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Episode 86 – Druid: a high-performance, column-oriented, distributed data store – part 1

01/05/2018 Duration: 31min

This is the first part of an interview with Fangjin Yang, co-founder and CEO at Imply and committer/PMC member for the Druid project. Druid is a high-performance, column-oriented, distributed data store which has entered the Hadoop environment with the recent integration with Apache. Since Druid has been around for a while, we are grateful to FJ for spending some time with our listeners. Fangjin Yang Cofounder and CEO at Imply (linkedin) Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Episode 85 – DataWorks Summit Community Showcase Exhibitor Soundbites

24/04/2018 Duration: 30min

This is the final part of our coverage of the DataWorks Summit Berlin 2018. Normally we would not have had an episode this week, since we were in Berlin last week, but we had lightning interviews with the vendors in the Community Expo Area and used that coverage to make this episode. So less of "Dave & Jhon" and more "ecosystem tech" snippets this time. Even though this does stray a bit from our usual content, we still hope it is useful. This was recorded in a hotel room and on the expo floor, so the audio quality is not up to our usual standards; we hope you’ll forgive us! Here is a timestamped list of the lightning interviews: 02:41 Hortonworks https://hortonworks.com/ 06:28 Alation https://alation.com/ 08:45 Arcadia Data https://www.arcadiadata.com/ 11:12 Attunity https://www.attunity.com/ 13:10 BlueMetrix https://www.bluemetrix.com/ 15:27 BMW https://www.bmw.com 18:04 IBM https://www.ibm.com 19:54 Microsoft https://www.microsoft.com 22:15 Nutanix https://www.nutanix.com/ 23:26

Episode 84 – DataWorks Summit Berlin – Day 2 Recap

19/04/2018 Duration: 01h30min

And with the end of day two of the 2018 DataWorks Summit in Berlin comes the end of this year's Europe Summit. But never fear, we have an extra 90 minutes of DataWorks goodness for you to consume on your way home. No real editing on this one, and we recorded in a hotel room, so audio quality may not be up to our usual standards; we hope you'll forgive us! Enjoy! Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Episode 83 – DataWorks Summit Berlin – Day 1 Recap

18/04/2018 Duration: 01h23min

Another year, another European DataWorks Summit, and yes, another daily recap show from Jhon and Dave. We walk through the keynotes and sessions we attended and give our thoughts and views. This should be useful for anyone who wasn't able to attend or those seeking to peek into sessions they couldn't make. No real editing on this one, and we recorded in a hotel room, so audio quality may not be up to our usual standards; we hope you'll forgive us! Enjoy! Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Episode 82 – DataWorks Summit Berlin 2018 Preview

10/04/2018 Duration: 47min

Next week is DataWorks Summit Berlin week! Your two hosts will be in attendance and in this episode we go over the agenda and plan which sessions we want to attend and why. Peppered throughout we add further insights and experiences from previous years. Unfortunately, Dave's network was a little unstable and there are a couple of audio glitches in this episode. For some session statistics or if you can use some help deciding what sessions you want to attend, you can use the dashboard we created: Click the screenshot above or go to http://aka.ms/DWS2018 to access the dashboard. It is a dynamic report: clicking on graph elements (bars or pie slices) will apply filters on all the visualizations and the session list. Use control-click to combine filters. At some point the dashboard will disappear because it is no longer relevant. For future reference, here is a large version of the screenshot. Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future ep

Episode 81 – Roaring News

03/04/2018 Duration: 26min

In this installment of Big Data News, we talk about the recent Facebook leak, how everybody is still doing it wrong (according to some at least) and installing Hadoop "the old-fashioned way". Also briefly covered is Elastic's X-Pack, now even more "open" than before, but still rather closed it would seem. Breaking News Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Episode 80 – Big Data Tracking

27/03/2018 Duration: 51min

Last June, Wolfie Christl published a 93-page report, Corporate Surveillance in Everyday Life, on big data tracking. Apart from the massive pdf that can be downloaded on the net, an extensive summary can be found on the Cracked Labs website. In this episode we go over the content and give our views on the subject. If you want to follow along with us while we are discussing the different points in the online article, here is the link: http://crackedlabs.org/en/corporate-surveillance Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Episode 79 – Roaring News

20/03/2018 Duration: 37min

Another Big Data news episode! This time we consider the big-or-small-nodes conundrum based on an article that, after close scrutiny, doesn't really seem to test the real issue. Other things that get covered are LinkedIn's Dynamometer, Cloudera's full production architecture advice for a recommendation service and a really interesting visualization technique based on blobs. Breaking News Big Data, Small Nodes https://insidebigdata.com/2018/02/22/make-sense-big-data-small-nodes/ Dynamometer Release https://github.com/linkedin/dynamometer https://venturebeat.com/2018/02/08/linkedin-open-sources-dynamometer-for-hadoop-performance-testing-at-scale/ Cisco IoT predictions Aka someone somewhere trots out the old “data is the new oil” trope for one more circuit, please please please stop? https://www.networkworld.com/article/3257769/internet-of-things/7-transportation-iot-predictions-from-cisco.html Production Recommendation Systems with Cloudera http://blog.cloudera.com/blog/2018/02

Episode 78 – Apache Trafodion transactional SQL for Hadoop (Part 2)

13/03/2018 Duration: 01h04min

This episode, a group of people from Esgyn join us to talk about the Apache Trafodion transactional SQL for Hadoop database engine. In this second part Rohit, Ken and Rao talk about the internal workings and best practices of Apache Trafodion. Rohit Jain Chief Technology Officer (linkedin) https://esgyn.com Ken Holt Chief Operating Officer and Co-Founder (linkedin) https://esgyn.com Rao Kakarlamudi VP of Pre-sales & Principal Architect (linkedin) https://esgyn.com In Search of Database Nirvana (oreilly) By Rohit Jain Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Episode 77 – Roaring News

06/03/2018 Duration: 47min

Another Roaring News episode where we cover recent Big Data News items we found interesting. This time we talk about Open Source turning 20 years old, the annoyances that come with Smart Homes and a big data divide in Germany. Additionally, we talk about some introductory guides to AI. Breaking News 20 years of open source + who contributes http://www.zdnet.com/article/open-source-turns-20/ https://www.infoworld.com/article/3253948/open-source-tools/who-really-contributes-to-open-source.html Smart home living is annoying as hell https://gizmodo.com/the-house-that-spied-on-me-1822429852 Big Data Divide https://www.politico.eu/article/to-protect-or-collect-germanys-big-data-divide/ The Art of Learning Data Science https://medium.com/@aparnack/the-art-of-learning-data-science-65b9f703f932 The Long Road To Become a Big Data Scientist - Infographic https://medium.com/@aparnack/sequel-to-the-art-of-learning-data-science-cb2e1f078e5a An executive’s guide to AI http

Episode 76 – Apache Trafodion transactional SQL for Hadoop (Part 1)

27/02/2018 Duration: 45min

This episode, a group of people from Esgyn join us to talk about the Apache Trafodion transactional SQL for Hadoop database engine. In this first part Rohit, Ken and Rao talk about the history and goals behind Apache Trafodion. Rohit Jain Chief Technology Officer (linkedin) https://esgyn.com Ken Holt Chief Operating Officer and Co-Founder (linkedin) https://esgyn.com Rao Kakarlamudi VP of Pre-sales & Principal Architect (linkedin) https://esgyn.com In Search of Database Nirvana (oreilly) By Rohit Jain Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Episode 75 – Roaring News

20/02/2018 Duration: 32min

In this Big Data News episode, we discuss the five-year anniversary of Hadoop Weekly, now Data Engineering Weekly, the Strava "data leak" and Twitter Wars, may the data be with you! Breaking News Five Years of Hadoop Weekly (Joe Crobak @joecrobak @Medium) https://medium.com/@joecrobak/five-years-of-hadoop-weekly-7aa8994f140b https://dataengweekly.com/ https://www.hadoopweekly.com/ How Strava's "anonymized" fitness tracking data spilled government secrets ([Nathan Ruser @Nrg8000] @zackwhittaker @ZDNet) http://www.zdnet.com/article/strava-anonymized-fitness-tracking-data-government-opsec/ http://www.abc.net.au/news/science/2018-01-29/strava-heat-map-shows-military-bases-and-supply-routes/9369490 Tweet Wars - The last data point (@basecamp_ai) http://www.knoyd.com/blog/the-last-data-point Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Episode 74 – Hadoop sizing part 3: Compute sizing

13/02/2018 Duration: 49min

As promised, in this final part of our Hadoop Sizing series, we round off the subject with sizing your compute and network resources. Undoubtedly we'll be revisiting this subject in the future, but the three parts of this series should give ample information on the subject for now. Hadoop Node Sizing Hadoop Data Node Density Tradeoff on HCC: https://community.hortonworks.com/content/kbentry/48878/hadoop-data-node-density-tradeoff.html Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.
