Papers Read on AI

Tech News #73

Keeping you up to date with the latest trends and best-performing architectures in this fast-evolving field of computer science. Selecting papers by comparative results, citations, and influence, we educate you on the latest research. Consider supporting us on Patreon.com/PapersRead for feedback and ideas.

Recent Episodes
  • ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases
    Nov 1, 2024 – 32:59
  • Mini-Omni2: Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities
    Oct 31, 2024 – 30:12
  • Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation
    Oct 30, 2024 – 39:12
  • F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching
    Oct 18, 2024 – 35:59
  • LightRAG: Simple and Fast Retrieval-Augmented Generation
    Oct 17, 2024 – 37:42
  • Aria: An Open Multimodal Native Mixture-of-Experts Model
    Oct 16, 2024 – 17:56
  • AgentKit: Structured LLM Reasoning with Dynamic Graphs
    Oct 15, 2024 – 30:22
  • PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling
    Oct 14, 2024 – 33:45
  • Diffusion Models are Evolutionary Algorithms
    Oct 10, 2024 – 31:05
  • Is Safer Better? The Impact of Guardrails on the Argumentative Strength of LLMs in Hate Speech Countering
    Oct 9, 2024 – 39:11
  • LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
    Oct 8, 2024 – 36:51
  • Internal Consistency and Self-Feedback in Large Language Models: A Survey
    Oct 7, 2024 – 01:20:28
  • On the Diagram of Thought
    Oct 2, 2024 – 17:27
  • 3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion
    Oct 1, 2024 – 46:12
  • StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation
    Sep 30, 2024 – 28:41
  • On the limits of agency in agent-based models
    Sep 24, 2024 – 32:39
  • Symbolic Prompt Program Search: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization
    Sep 23, 2024 – 17:23
  • PuLID: Pure and Lightning ID Customization via Contrastive Alignment
    Sep 22, 2024 – 29:56
  • MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery
    Sep 21, 2024 – 33:14
  • PuLID: Pure and Lightning ID Customization via Contrastive Alignment
    Sep 20, 2024 – 29:56
  • Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming
    Sep 19, 2024 – 30:36
  • LLaMA-Omni: Seamless Speech Interaction with Large Language Models
    Sep 18, 2024 – 32:15
  • GeoCalib: Learning Single-image Calibration with Geometric Optimization
    Sep 17, 2024 – 19:16
  • Artificial Immune System of Secure Face Recognition Against Adversarial Attacks
    Sep 13, 2024 – 01:10:54
  • Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
    Sep 12, 2024 – 29:24
  • rerankers: A Lightweight Python Library to Unify Ranking Methods
    Sep 11, 2024 – 15:39
  • Automated Design of Agentic Systems
    Sep 10, 2024 – 23:55
  • Text2SQL is Not Enough: Unifying AI and Databases with TAG
    Sep 9, 2024 – 42:53
  • Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
    Sep 5, 2024 – 35:05
  • Sapiens: Foundation for Human Vision Models
    Sep 4, 2024 – 25:58
  • OctFusion: Octree-based Diffusion Models for 3D Shape Generation
    Sep 3, 2024 – 33:00
  • Writing in the Margins: Better Inference Pattern for Long Context Retrieval
    Sep 2, 2024 – 29:22
  • Fact Finder -- Enhancing Domain Expertise of Large Language Models by Incorporating Knowledge Graphs
    Aug 30, 2024 – 19:53
  • RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation
    Aug 29, 2024 – 18:01
  • RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation
    Aug 28, 2024 – 27:28
  • DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
    Aug 23, 2024 – 47:39
  • LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
    Aug 21, 2024 – 38:53
  • ControlNeXt: Powerful and Efficient Control for Image and Video Generation
    Aug 20, 2024 – 26:50
  • OpenResearcher: Unleashing AI for Accelerated Scientific Research
    Aug 19, 2024 – 29:59
  • Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation
    Aug 14, 2024 – 33:50
  • AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls
    Aug 13, 2024 – 41:29
  • Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
    Aug 9, 2024 – 38:55
  • LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
    Aug 8, 2024 – 29:11
  • CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
    Aug 7, 2024 – 31:47
  • MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
    Aug 5, 2024 – 26:22
  • Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models
    Jul 31, 2024 – 34:03
  • FinanceBench: A New Benchmark for Financial Question Answering
    Jul 30, 2024 – 41:34
  • Stable-Hair: Real-World Hair Transfer via Diffusion Model
    Jul 29, 2024 – 30:25
  • Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
    Jul 26, 2024 – 31:03
  • FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs
    Jul 25, 2024 – 34:06
Recent Reviews
  • Bland25
    Amazing consistency
    I love what you are doing.
Disclaimer: The podcast and artwork on this page are property of the podcast owner, and not endorsed by UP.audio.