Date | Post | Archived |
---|---|---|
2025-10-06 |
Links for 2025-10-06
AI: Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation https://arxiv.org/abs/2509.25849
|
|
2025-10-01 |
Links for 2025-10-01
Periodic Labs
|
|
2025-09-29 |
Links for 2025-09-29
AI: Julian Schrittwieser, co-first author on AlphaGo, AlphaZero, and MuZero, summarizes studies of recent progress and what we should expect in the next…
|
|
2025-09-26 |
Links for 2025-09-26
CA-1 Europa
|
|
2025-09-24 |
Links for 2025-09-24
New Temples for the Sand God
|
|
2025-09-22 |
Links for 2025-09-22
Google DeepMind discovers new solutions to century-old problems in fluid dynamics
|
|
2025-09-18 |
Links for 2025-09-18
Google and OpenAI at the International Collegiate Programming Contest World Finals
|
|
2025-09-15 |
Links for 2025-09-15
Gauss, an agent for autoformalization
|
|
2025-09-10 |
Links for 2025-09-11
AI system creates expert-level scientific software
|
|
2025-09-06 |
Links for 2025-09-06
AI: New research explains why LLMs hallucinate, through a connection between supervised and self-supervised learning.
|
|
2025-09-02 |
Links for 2025-09-02
AI: A vision and path forward for genetically encoding almost all chemistry, powered by new AI tools…
|
|
2025-08-30 |
Links for 2025-08-30
AI: Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search https://arxiv.org/abs/2508.15884v1
|
|
2025-08-24 |
Links for 2025-08-24
AI: Genie 3: An infinite world model with Shlomi Fruchter and Jack Parker-Holder https://www.youtube.com/watch?v=n5x6yXDj0uo
|
|
2025-08-20 |
Links for 2025-08-20
AI: SSRL: Self-Search Reinforcement Learning https://arxiv.org/abs/2508.10874
|
|
2025-08-14 |
Links for 2025-08-14
AI: A conversation with Demis Hassabis on world models (Genie 3), Deep Think, the need for better evals (Game Arena), and their progress towards AGI.
|
|
2025-08-11 |
Links for 2025-08-11
AI: Neuroscience study provides yet more evidence that AI systems and human brains converge on similar ways of representing the world…
|
|
2025-08-08 |
Links for 2025-08-08
GPT-5: The wait after GPT-4 was about four months shorter than the wait between GPT-3 and GPT-4.
|
|
2025-08-05 |
Links for 2025-08-05
Open models by OpenAI
|
|
2025-07-31 |
Links for 2025-07-31
AI: GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning https://arxiv.org/abs/2507.19457
|
|
2025-07-24 |
Links for 2025-07-24
New Temples for the Sandgod
|
|
2025-07-22 |
Links for 2025-07-22
Hierarchical Reasoning Model
|
|
2025-07-19 |
Links for 2025-07-19
ChatGPT agent
|
|
2025-07-16 |
Links for 2025-07-16
Meta Superintelligence
|
|
2025-07-13 |
Links for 2025-07-13
Kimi K2: Open Agentic Intelligence
|
|
2025-07-10 |
Links for 2025-07-10
AI: “Reachy Mini: cute and low priced, hackable yet easy to use, powered by open-source and the infinite community.
|
|
2025-07-05 |
Links for 2025-07-05
AI: Energy-Based Transformers are Scalable Learners and Thinkers. “We outscale (feed-forward) transformers while generalizing reasoning/system 2…
|
|
2025-07-02 |
Links for 2025-07-02
Self-play improvement
|
|
2025-06-30 |
Links for 2025-06-30
AI: Chai-2, a major breakthrough in molecular design.
|
|
2025-06-27 |
Links for 2025-06-27
AI: Steering Your Diffusion Policy with Latent Space Reinforcement Learning. “If you have a policy that uses diffusion/flow (e.g.…
|
|
2025-06-25 |
Links for 2025-06-25
Gemini Robotics On-Device brings AI to local robotic devices
|
|
2025-06-23 |
Links for 2025-06-23
San Mateo-based Generalist showcased its end-to-end neural network in action
|
|
2025-06-21 |
Links for 2025-06-21
AI: The upcoming GPT-3 moment for RL: “We suspect the next AI paradigm will emerge from leveraging existing software to efficiently build training…
|
|
2025-06-18 |
Links for 2025-06-18
Neural OS
|