JUG Milano Meeting #157
Friday, 13 September 2024
DataFrame - a Swiss Army Knife of Java Data Processing
Hybrid event: online and in person.
**In-person** attendance is free and open to everyone, but registration is MANDATORY via: registration form for attending JUG Milano in person. A few hours before the event you will receive an invitation from a ticketing system other than Eventbrite to confirm access to the building.
The Eventbrite ticket does NOT grant access to the building; we use it only to keep an orderly record of email addresses and to understand who is interested in attending in person.
Talk abstract:
Can we use big data techniques without big data infrastructure? As Java developers, we deal with data processing all the time. We may be analyzing app logs, extracting data from Excel files, copying tables between different databases (simple ETL), etc. Yet “standard” Java falls short in processing capabilities when compared to more complex and heavyweight “big data” solutions like Spark or Flink. This talk will focus on the “DataFrame”: an in-memory, two-dimensional table with operations like filtering, column / row transformations, joins, aggregations, etc. I will use the open-source DFLib library (https://dflib.org) and a Jupyter notebook to demonstrate how to do data processing and visualization in any Java app without much fuss.
Presented by Andrus Adamchik:
Andrus is a passionate open-source developer and a member of the Apache Software Foundation. He started programming in Java back in 1998, and founded a number of open-source projects: Apache Cayenne - a developer-friendly ORM, Bootique.io - a lightweight Java app platform, Agrest.io - a framework for dynamic REST services, and DFLib - DataFrame for Java. In his day job, Andrus is an IT entrepreneur, running a software company called ObjectStyle.