Tue, 13 August 2024
All right, it is time to pull back the curtain on all this AI stuff and really learn how it works! In this episode we dive deep into AI, neural networks, refinements, and vector databases (and why we need them) so you can understand the underlying principles of AI and LLMs! The field is vast and interesting, and more importantly, it is here to stay. So take a listen and keep learning about this new tool we should all be familiar with! http://www.javapubhouse.com/datadog
And follow us!
Thu, 18 April 2024
We continue to have guests on our show to talk to us about interesting things... This time it is about Apache Tika, an incredible tool for file processing and metadata extraction for search. Imagine you have tons of unstructured files, like emails or documents, and you want to extract, index, and then search them. This is Tika's purpose. And who better to walk us through how it does its magic than its Project Management Committee (PMC) Chair, Tim Allison! So take a listen as we go deeper on ingesting tons of content (which is fundamental for things like training LLMs). http://www.javapubhouse.com/datadog
Apache Tika
OpenSearch Project and OpenSearch Neural Plugin
Tutorials
Selected Advanced File Processing toolkits/services
Selected Hybrid Search/RAG toolkits (there are _MANY_ others!)
Search/Relevance Conferences
Tim's personal project
Mon, 18 March 2024
We had a great time talking to Matt Topol from Voltron Data about one of his Apache Software Foundation projects, Apache Arrow. It is both a spec and an implementation of a columnar data format that is not only efficient but also cross-language compatible. We walk through the scenarios it covers and how it is becoming more and more pivotal for things like ML and LLMs. So come listen to this JPH episode on one of the best free ways to distribute data and integrate services working on top of that data! http://www.javapubhouse.com/datadog