From: 🔳 Turing Post
Date: Mon, 12 Jan 2026 23:17:21 +0000 (UTC)
Subject: FOD#135: What It Means When AI Labs Step Into Healthcare

## This Week in Turing Post:

* **Wednesday** / AI 101 series: **Web World Models**
* **Friday** / We will start a **New Series!**

----------

### 🤝 From our partners: Vault-Free Privileged Access for Modern Engineering Teams

As AI and cloud infrastructure scale, managing privileged access with static credentials and vaults becomes both a bottleneck and a risk. Teleport replaces rotated credentials and vaulted secrets with real Zero Trust, issuing short-lived, cryptographic certificates at runtime for every human, machine, and AI agent.

[Discover how vault-free PAM reduces risk and accelerates engineering.](https://fandf.co/3MLWk60)

Learn more (https://fandf.co/3MLWk60)

----------

_**Our news digest is always free. Click on**_ [_**the partner’s link**_](https://www.agora.io/en/products/conversational-ai-engine/?utm_source=turing-post&utm_medium=email&utm_campaign=convo-ai) _**above to support us or**_ [_**Upgrade**_](https://www.turingpost.com/upgrade) _to receive our deep dives in full, directly into your inbox. Join Premium members from top companies like_ _**Nvidia, Hugging Face, Microsoft, Google, a16z, etc., plus AI labs such as Ai2, MIT, Berkeley, .gov**_, _and thousands of others to really understand what’s going on with AI →_

Upgrade today (https://www.turingpost.com/upgrade)

----------

**Last week at CES:** Robots! More Robots! And Jensen Huang says they will have human-level capabilities THIS year. We went to see if the robots were aware of that. [Watch the video :)](https://youtu.be/PAjb83HHLaU)

YouTube: Jensen Huang says robots will have human capabilities this year! The Robots at CES Had.. Other Plans (https://youtu.be/PAjb83HHLaU)

Also last week: **Why OpenAI and Anthropic Chose Healthcare at the Same Time**

Right after the holidays, both [OpenAI](https://openai.com/index/openai-for-healthcare/) and [Anthropic](https://www.anthropic.com/news/healthcare-life-sciences) announced healthcare-focused initiatives within days of each other. For once, I don’t see this as a competition. What I like is the signal it sends: **healthcare has crossed a threshold where staying out is no longer the cautious choice.**

For several years, healthcare was treated as a deferred domain for leading AI labs. Understandably: the sector is heavily regulated, operationally fragmented, and unforgiving of confident mistakes. Earlier generations of models were difficult to bound, difficult to audit, and prone to failure modes that could not be cleanly isolated from their successes. In low-stakes domains, that was acceptable. In healthcare – not at all.

The decision by both labs to move now implies a shared conclusion that something fundamental has changed. The **models are** certainly more capable now, but most importantly – they are **more governable.**

**Healthcare is** therefore better understood as **a systems test** rather than a market opportunity. **This is a hugely important step in AI adoption.**

Another point worth mentioning: doctors should not be worried. **What AI is being applied to is coordination.** It’s an old problem in healthcare that no one is structurally positioned to assemble full context under time pressure: information is distributed across multiple systems, and signals from medications, labs, imaging, wearables, genetics, and prior history are rarely considered together when decisions are made – and patients are left to play detective, putting all the pieces together on their own. In this framing, **LLMs** are not making medical judgments. They **mainly help bring existing information together so it can be reviewed more easily**.

Both labs appear to believe this coordination role is now stable enough to **turn into a product.**

**Where the two labs differ is in how they approach this coordination role.**

**OpenAI is extending its general assistant** into healthcare, treating health data as another high-value context that can sit alongside documents, calendars, and enterprise tools, with additional privacy and access controls layered on top. The underlying assumption is that a single, familiar interface can serve patients, clinicians, and administrative workflows, as long as the boundaries around data use are clearly defined.

**Anthropic is taking a narrower approach.** Its healthcare effort is oriented less toward a patient-facing assistant and more toward embedding Claude inside existing institutional workflows. The emphasis is on predictable behavior, limited scope, and alignment with how healthcare organizations already operate. Rather than broad continuity across use cases, the focus is on fitting cleanly into specific professional contexts.

These choices of focus reflect different theories of how trust is built in regulated systems. One assumes trust emerges from continuity and widespread use, the other from constraint and institutional alignment. It is not yet clear which approach will prove more durable, and it is possible that both will coexist in different parts of the system. What matters is that both labs are now willing to test their models in an environment where responsibility cannot remain abstract. I’m very excited about this new development.

----------

_**Follow us on**_ 🎥 [YouTube](https://www.youtube.com/@RealTuringPost) [Twitter](https://x.com/TheTuringPost) [Hugging Face](https://huggingface.co/Kseniase) 🤗

----------

## Twitter Library

11 New Interesting Policy Optimization Techniques: (https://www.turingpost.com/p/policyoptimization)

Upgrade (https://www.turingpost.com/upgrade)

----------

### We are reading

* [On the Slow Death of Scaling](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5877662) by Sara Hooker
* [a16z: The Power Brokers](https://www.notboring.co/p/a16z-the-power-brokers) by Packy McCormick
* [Zhipu AI and MiniMax Just Went Public, But They're Not China's OpenAI](https://recodechinaai.substack.com/p/zhipu-ai-and-minimax-just-went-public) by Recode China
* [Inside MiniMax: Testing if AGI is Possible Without Infinite VC Money](https://www.turingpost.com/p/minimax)

### News from the usual suspects

* **Gmail Gets Gemini-fied**
  Gmail is [stepping into](https://blog.google/products-and-platforms/products/gmail/gmail-is-entering-the-gemini-era/) 2026 with Gemini AI at the helm. Google’s flagship inbox now offers AI Overviews to summarize email threads, answer natural language queries, and filter clutter with the upcoming “AI Inbox.” Help Me Write and Suggested Replies get smarter, while proofreading goes premium. It’s no longer just email – it’s your AI-powered executive assistant.

* **Apple + Google: The Gemini Marriage**
  Apple has [picked](https://www.cnbc.com/2026/01/12/apple-google-ai-siri-gemini.html?) Google’s Gemini to power the long-delayed AI upgrade to Siri, marking a rare alliance between rivals. The multiyear partnership puts Gemini models at the core of Apple’s upcoming “Foundation Models,” keeping compute mostly on-device and in Apple’s private cloud. Apple remains mum on the $1B/year price tag, but this move signals Cupertino is finally showing up to the AI arms race – fashionably late, of course.

* **Musk's Macrohard Moment**
  xAI, Elon Musk’s AI venture, [torched $7.8 billion](https://www.bloomberg.com/news/articles/2026-01-09/musk-s-xai-reports-higher-quarterly-loss-plans-to-power-optimus) in just nine months, chasing its dream of powering humanoid robots like Optimus. Despite swelling quarterly losses, revenue doubled to $107 million, and a $20B cash injection (featuring Nvidia) suggests the spending spree is far from over. "Macrohard" may be a pun on Microsoft – but the burn rate is no joke.

----------

### 🔦 Research highlight

Researchers from MIT CSAIL present Recursive Language Models (RLMs), a novel inference-time architecture enabling LLMs to process arbitrarily long prompts – scaling beyond 10 million tokens, over 100× typical context windows. Instead of consuming the prompt directly, RLMs offload it into a Python REPL as a variable (`context`), allowing the LLM to symbolically interact with the prompt via code. The model can read, transform, and decompose the context and recursively call sub-LLMs through a built-in `llm_query()` function. This enables dynamic task decomposition, selective context access, and unbounded reasoning. RLMs require no retraining and work with existing models (GPT-5, Qwen3-Coder), achieving up to 2× higher accuracy than base LLMs and long-context agents on benchmarks like BrowseComp+, OOLONG, and OOLONG-Pairs, while keeping inference cost comparable or lower. Ablation studies confirm the critical role of both the REPL environment and recursive sub-calls in solving complex, information-dense tasks.

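
To make the mechanism described above concrete, here is a minimal sketch of the idea. It is not the paper’s implementation. It assumes a hypothetical `fake_llm_query` stub in place of a real sub-LLM call, and it hard-codes the decomposition (split the `context` variable in half by lines, then recurse), whereas an actual RLM would write that code for itself inside the REPL. The names `recursive_count` and `max_chars` are illustrative only.

```python
from typing import Callable


def fake_llm_query(prompt: str) -> str:
    """Hypothetical stand-in for a sub-LLM call.

    It "answers" one fixed question: how many lines mention ERROR?
    A real system would send `prompt` to a model instead.
    """
    return str(sum(1 for line in prompt.splitlines() if "ERROR" in line))


def recursive_count(
    context: str,
    llm_query: Callable[[str], str],
    max_chars: int = 200,
) -> int:
    """Answer a question over `context` without reading it in one pass.

    If the text fits the (assumed) sub-model budget, query it directly.
    Otherwise split on line boundaries, recurse, and combine the answers.
    """
    if len(context) <= max_chars:
        return int(llm_query(context))

    lines = context.splitlines()
    if len(lines) <= 1:
        # A single oversized line cannot be split further on line boundaries.
        return int(llm_query(context))

    mid = len(lines) // 2
    left = "\n".join(lines[:mid])
    right = "\n".join(lines[mid:])
    return (
        recursive_count(left, llm_query, max_chars)
        + recursive_count(right, llm_query, max_chars)
    )


# The long prompt lives in a plain variable, as in the REPL setup described above.
context = "\n".join(
    f"line {i}: ERROR disk full" if i % 10 == 0 else f"line {i}: ok"
    for i in range(1000)
)

answer = recursive_count(context, fake_llm_query)
print(answer)
```

In this sketch the top-level call never passes the full `context` to the stub. Only pieces under `max_chars` reach it, and the partial answers are combined in ordinary code.
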
**This is a significant step forward because RLMs break the fundamental context window barrier of LLMs – enabling scalable, symbolic, and recursive reasoning over massive inputs without retraining or architectural changes** [→read the paper](https://arxiv.org/abs/2512.24601)

## Models

* **Liquid: LFM2.5 – The Next Generation of On-Device AI**
  Release an open-weight 1.2B-class model family optimized for edge agents by extending pretraining to 28T tokens, scaling post-training with multi-stage reinforcement learning, and shipping text, Japanese, vision-language, and native audio variants with day-zero runtime support across common inference stacks and NPUs [→read the paper](https://www.liquid.ai/blog/introducing-lfm2-5-the-next-generation-of-on-device-ai)

* **MiMo-V2-Flash Technical Report**
  Deliver fast, strong reasoning and agentic performance by combining a large MoE backbone with hybrid attention, multi-token prediction, and multi-teacher on-policy distillation to push decoding speed and parameter efficiency [→read the paper](https://arxiv.org/abs/2601.02780)

* **K-EXAONE Technical Report**
  Provide a multilingual MoE foundation model with long-context support that targets balanced reasoning, agentic, and industrial capabilities across multiple major languages [→read the paper](https://arxiv.org/abs/2601.01739)

* **LTX-2: Efficient Joint Audio-Visual Foundation Model**
  Generate temporally synchronized video and audio in a single unified model by coupling asymmetric modality-specific transformers through cross-attention for efficient, controllable audiovisual synthesis [→read the paper](https://arxiv.org/abs/2601.03233)

----------

## Research this week

_(🌟 indicates papers that we recommend paying attention to)_

**World models, environments, and embodied learning**

* **Digital Twin AI: Opportunities and Challenges from Large Language Models to World Models**
  Unify how AI augments digital twins across modeling, mirroring, intervention, and autonomous management stages [→read the paper](https://arxiv.org/abs/2601.01321)

* 🌟 **WebGym: Scaling Training Environments for Visual Web Agents with Realistic Tasks (Microsoft)**
  Provide a large-scale, non-stationary web environment with rubric-based rewards to train and evaluate visual web agents [→read the paper](https://arxiv.org/abs/2601.02439)

* **Scaling Behavior Cloning Improves Causal Reasoning**
  Show that scaling data and depth in behavior cloning improves causal policies in real-time video game agents [→read the paper](https://arxiv.org/abs/2601.04575)

* **Evolving Programmatic Skill Networks**
  Grow a compositional network of executable skills that reflect, refactor, and stabilize over time in open-ended environments [→read the paper](https://arxiv.org/abs/2601.03509)

**Agents, tools, and orchestration**

* **Atlas: Orchestrating Heterogeneous Models and Tools for Multi-Domain Complex Reasoning**
  Route across models and tools using training-free priors and reinforcement learning to exploit heterogeneity in complex reasoning tasks [→read the paper](https://arxiv.org/abs/2601.03872)

* **MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning**
  Interleave multimodal chain-of-thought reasoning with autonomous tool invocation to solve open-ended, real-world problems [→read the paper](https://arxiv.org/abs/2512.23412)

* **RelayLLM: Efficient Reasoning via Collaborative Decoding**
  Coordinate small and large models at the token level so lightweight models request help only when needed to cut inference cost [→read the paper](https://arxiv.org/abs/2601.05167)

* 🌟 **Over-Searching in Search-Augmented Large Language Models (Apple)**
  Diagnose when retrieval harms efficiency and truthfulness and propose metrics and mitigations for search overuse [→read the paper](https://arxiv.org/abs/2601.05503)

* **Can We Predict Before Executing Machine Learning Agents?**
  Replace costly execution with predictive reasoning by internalizing execution priors and using a predict-then-verify loop [→read the paper](https://arxiv.org/abs/2601.05930)

* **GenCtrl: A Formal Controllability Toolkit for Generative Models**
  Formalize controllability as a control problem and estimate controllable sets to expose the limits of human influence over generation [→read the paper](https://arxiv.org/abs/2601.05637)

**Agent memory, long-horizon reasoning, and experience compression**

* **SimpleMem: Efficient Lifelong Memory for LLM Agents**
  Compress interaction histories into high-density semantic memory units, consolidate them asynchronously into abstractions, and retrieve them adaptively to reduce token cost while preserving long-term performance [→read the paper](https://arxiv.org/abs/2601.02553)

* **MAGMA: A Multi-Graph based Agentic Memory Architecture for AI Agents**
  Represent memories across semantic, temporal, causal, and entity graphs and retrieve them via policy-guided traversal to enable interpretable, query-aligned long-horizon reasoning [→read the paper](https://arxiv.org/abs/2601.03236)

* **Memory Matters More: Event-Centric Memory as a Logic Map for Agent Searching and Reasoning**
  Organize experiences into an event graph with explicit logical relations to support structured navigation over memory instead of shallow similarity search [→read the paper](https://arxiv.org/abs/2601.04726)

* **Distilling Feedback into Memory-as-a-Tool**
  Amortize inference-time critique by storing feedback as retrievable guidelines that agents can reuse as a tool to reduce reasoning cost [→read the paper](https://arxiv.org/abs/2601.05960)

**Agent evaluation, verification, and confidence**

* **Agent-as-a-Judge**
  Evolve evaluation from single-pass model judging to agentic judges with planning, tools, collaboration, and memory to enable verifiable multi-step assessment [→read the paper](https://arxiv.org/abs/2601.05111)

* **Agentic Rubrics as Contextual Verifiers for SWE Agents**
  Generate repository-specific rubric checklists via agent interaction to verify code patches without executing tests while remaining grounded and interpretable [→read the paper](https://arxiv.org/abs/2601.04171)

* **Confidence Estimation for LLMs in Multi-turn Interactions**
  Measure and improve confidence calibration across turns by formalizing monotonicity and per-turn reliability as context accumulates [→read the paper](https://arxiv.org/abs/2601.02179)

* **Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency**
  Evaluate belief robustness by probing consistency across contextual neighborhoods rather than relying on point-wise self-consistency [→read the paper](https://arxiv.org/abs/2601.05905)

**Reasoning dynamics, structure, and control**

* **DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs**
  Reformulate chain-of-thought generation as an iterative denoising process to enable retrospective correction of reasoning steps [→read the paper](https://arxiv.org/abs/2601.03559)

* **The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning**
  Analyze long reasoning traces as structured interaction patterns and guide the synthesis of stable reasoning trajectories [→read the paper](https://arxiv.org/abs/2601.06002)

* **Mechanistic Interpretability of Large-Scale Counting in LLMs through a System-2 Strategy**
  Decompose large counting tasks into reliable subproblems and trace how intermediate counts are represented and aggregated inside the model [→read the paper](https://arxiv.org/abs/2601.02989)

* **Large Reasoning Models Are (Not Yet) Multilingual Latent Reasoners**
  Probe how latent reasoning forms across languages and show that internal reasoning dynamics largely follow an English-centered pathway [→read the paper](https://arxiv.org/abs/2601.02996)

* **Parallel Latent Reasoning for Sequential Recommendation**
  Scale reasoning width by exploring multiple latent reasoning trajectories in parallel to improve generalization under real-time constraints [→read the paper](https://arxiv.org/abs/2601.03153)

**Training efficiency, data efficiency, and optimization**

* **SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving**
  Push lightweight supervised fine-tuning to state-of-the-art SWE performance through curated datasets, curriculum design, and verifier-based test-time scaling [→read the paper](https://arxiv.org/abs/2601.01426)

* **One Sample to Rule Them All: Extreme Data Efficiency in RL Scaling**
  Demonstrate that a single, carefully engineered training sample can unlock broad reasoning gains across domains via reinforcement learning [→read the paper](https://arxiv.org/abs/2601.03111)

* **Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting**
  Suppress destructive gradients on confident-but-conflicting tokens by gating updates with entropy to reduce catastrophic forgetting during fine-tuning [→read the paper](https://arxiv.org/abs/2601.02151)

* **Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers**
  Replace fixed norm equilibria with learnable scaling factors to adapt weight magnitudes to data and improve downstream performance [→read the paper](https://arxiv.org/abs/2601.04890)

* 🌟 **GDPO: Group reward-Decoupled Normalization Policy Optimization (Nvidia)**
  Decouple reward normalization in multi-reward reinforcement learning to preserve signal resolution and improve training stability [→read the paper](https://arxiv.org/abs/2601.05242)

----------

_That’s all for today. Thank you for reading! Please **send this newsletter to colleagues** if it can help them enhance their understanding of AI and stay ahead of the curve._

----------

You are reading a plain text version of this post. For the best experience, copy and paste this link in your browser to view the post online:
https://www.turingpost.com/p/fod135
plus = robots from CES  ‌ ‌ ‌ ‌ &= #8204; ‌ ‌ ‌ ‌ ‌= 0;‌ ‌ ‌ ‌ ‌ ‌&= #160;‌ ‌ ‌ ‌ ‌ ̴= 4; ‌ ‌ ‌ ‌ ‌ &#= 8204; ‌ ‌ ‌ ‌ ‌ = ;‌ ‌ ‌ ‌ ‌ ‌&#= 160;‌ ‌ ‌ ‌ ‌ ‌= ; ‌ ‌ ‌ ‌ ‌ = 204; ‌ ‌ ‌ ‌ ‌ = ‌ ‌ ‌ ‌ ‌ ‌= 60;‌ ‌ ‌ ‌ ‌ ‌=  ‌ ‌ ‌ ‌ ‌ R= 04; ‌ ‌ ‌ ‌ ‌ &= #8204; ‌ ‌ ‌ ‌ ‌= 0;‌ ‌ ‌ ‌ ‌ ‌&= #160;‌ ‌ ‌ ‌ ‌ ̴= 4; ‌ ‌ ‌ ‌ ‌ &#= 8204; ‌ ‌ ‌ ‌ ‌ = ;‌ ‌ ‌ ‌ ‌ ‌&#= 160;‌ ‌ ‌ ‌ ‌ ‌= ; ‌ ‌ ‌ ‌ ‌ = 204; ‌ ‌ ‌ ‌ ‌ = ‌ ‌ ‌ ‌ ‌ ‌= 60;‌ ‌ ‌ ‌ ‌ ‌=  ‌ ‌ ‌ ‌ ‌ R= 04; ‌ ‌ ‌ ‌ ‌ &= #8204; ‌ ‌ ‌ ‌ ‌= 0;‌ ‌ ‌ ‌ ‌ ‌&= #160;‌ ‌ ‌ ‌ ‌ ̴= 4; ‌ ‌ ‌ ‌ ‌ &#= 8204; ‌ ‌ ‌ ‌ ‌ = ;‌ ‌ ‌ ‌ ‌ ‌&#= 160;‌ ‌ ‌ ‌ ‌ ‌= ; ‌ ‌ ‌ ‌ ‌

Last week a= t CES: Robots! More Robots! And Jensen Huang says they will have human-= level capabilities THIS year. We went to see if robots were aware of that. = Watch the video :)

=
=

January 12, 2026   |   = Listen Online=   |   Read Online

FOD#135: What It Means When AI Labs Step Into Healthca= re

plus robots from CES

3D"like"
<= /td>
 
3D"comment"
<= /td>
<= /tr>
3D"share
 
3D"share
 
=3D"share
=
 =
3D"share
3D""=

This Week in Turing Post:

  • Wednesday / AI 101 series: Web World Models=

  • Friday / We will start a New Series!

 

=F0=9F=A4=9D From our partners: Vault-F= ree Privileged Access for Modern Engineering Teams

3D""

As AI and cloud inf= rastructure scale, managing privileged access with static credentials and v= aults becomes both a bottleneck and a risk. Teleport replaces rotated crede= ntials and vaulted secrets with real Zero Trust, issuing short-lived, crypt= ographic certificates at runtime for every human, machine, and AI agent.=C2= =A0

Discover how vault-free PAM = reduces risk and accelerates engineering.

Learn more

Our news digest is always free. Click on the partner=E2=80=99s link above to support us or Upgrade to receive our deep dives in full, directly into your inbo= x. Join Premium members from top companies like Nvidia,=C2=A0Hugging Face, Microsoft, Google, a16z etc plus AI labs such a= s Ai2, MIT, Berkeley, .gov, and thousand= s of others to really understand what=E2=80=99s going on with AI =E2=86=92<= /i>

=
Upgrad= e today
 
=
3D""
= 3D"YouTube

Jensen Huang says robots will have= human capabilities this year! The Robots at CES Had.. Other Plans

Also last week: Why OpenAI and A= nthropic Chose Healthcare at the Same Time

Right after the holidays, both OpenAI and Anthrop= ic announced healthcare-focused initiatives within days of each = other. For the first time, I don=E2=80=99t think about it as a competition,= what I like about it is that it=E2=80=99s a signal that healthcare has = crossed a threshold where staying out is no longer the cautious choice.= =C2=A0

For several years, healthcare was treated as a deferred doma= in for leading AI labs. Understandably: the sector is heavily regulated, op= erationally fragmented, and unforgiving to confident mistakes. Earlier gene= rations of models were difficult to bound, difficult to audit, and prone to= failure modes that could not be cleanly isolated from their successes. In = low-stakes domains, this was ok. In healthcare =E2=80=93 not at all.

The decision by both labs to move now implies a shared conclusion that so= mething fundamental has changed. The models are for sure more capabl= e now, but most importantly =E2=80=93 they are more governable.

=

Healthcare is therefore better understood as a systems test rather than a market opportunity. This is a hugely important step in A= I adoption.

Another moment worth mentioning: doctors should not = be worried. What AI is being applied to is coordination. It=E2=80=99= s an old problem in healthcare that no one is structurally positioned to as= semble full context under time pressure: information is distributed across = multiple systems, and signals from medications, labs, imaging, wearables, g= enetics, and prior history are rarely considered together when decisions ar= e made =E2=80=93 and patients are left to play detectives putting all the p= ieces together on their own. In this framing, LLMs are not making me= dical judgments. They mainly help bring existing information together so= it can be reviewed more easily.

Both labs appear to believe th= is coordination role is now stable enough to turn into a product.

Where the two labs differ is in how they approach this coordination = role.=C2=A0

OpenAI is extending its general assistant int= o healthcare, treating health data as another high-value context that can s= it alongside documents, calendars, and enterprise tools, with additional pr= ivacy and access controls layered on top. The underlying assumption is that= a single, familiar interface can serve patients, clinicians, and administr= ative workflows, as long as the boundaries around data use are clearly defi= ned.

Anthropic is taking a narrower approach. Its healthcare = effort is oriented less toward a patient-facing assistant and more toward e= mbedding Claude inside existing institutional workflows. The emphasis is on= predictable behavior, limited scope, and alignment with how healthcare org= anizations already operate. Rather than broad continuity across use cases, = the focus is on fitting cleanly into specific professional contexts.

The choices what to focus on reflect different theories of how trust is b= uilt in regulated systems. One assumes trust emerges from continuity and wi= despread use, the other from constraint and institutional alignment. It is = not yet clear which approach will prove more durable, and it is possible th= at both will coexist in different parts of the system. What matters is that= both labs are now willing to test their models in an environment where res= ponsibility cannot remain abstract. I=E2=80=99m very excited about this new= development.

=

Follow us on =C2=A0=F0=9F=8E=A5 YouTube=C2=A0Twitter= =C2=A0 = Hugging Face =F0=9F=A4=97

 
=

Twitter Library

&nb= sp;

11 New Interesting Policy O= ptimization Techniques

Policy optimizat= ion is one of the most exciting topics for the AI community right now. Why?=

www.turingpost.com/p/policyoptimization

 
=

That=E2=80= =99s all for today. Thank you for reading! Please send this newsl= etter to colleagues if it can help them enhance their understand= ing of AI and stay ahead of the curve.

Upgrade
=
 

We are reading

=

News from the usual suspects

=  

=F0=9F=94=A6 Research highlight

=
3D""

Researchers from MIT C= SAIL present Recursive Language Models (RLMs), a novel inference-time archi= tecture enabling LLMs to process arbitrarily long prompts =E2=80=93 scaling= beyond 10 million tokens, over 100=C3=97 typical context windows. Instead = of consuming the prompt directly, RLMs offload it into a Python REPL as a v= ariable (context), allowing the LLM to symbolically interact w= ith the prompt via code. The model can read, transform, and decompose the c= ontext and recursively call sub-LLMs through a built-in llm_query() function. This enables dynamic task decomposition, selective context = access, and unbounded reasoning. RLMs require no retraining and work with e= xisting models (GPT-5, Qwen3-Coder), achieving up to 2=C3=97 higher accurac= y than base LLMs and long-context agents on benchmarks like BrowseComp+, OO= LONG, and OOLONG-Pairs, while keeping inference cost comparable or lower. A= blation studies confirm the critical role of both the REPL environment and = recursive sub-calls in solving complex, information-dense tasks.
This= is a significant step forward because RLMs break the fundamental context w= indow barrier of LLMs =E2=80=93 enabling scalable, symbolic, and recursive = reasoning over massive inputs without retraining or architectural changes =E2=86=92read = the paper

Models

  • Liquid: LFM2.5 – The Next Generation of On-Device AI
    Release an open-weight 1.2B-class model family optimized for edge agents by extending pretraining to 28T tokens, scaling post-training with multi-stage reinforcement learning, and shipping text, Japanese, vision-language, and native audio variants with day-zero runtime support across common inference stacks and NPUs → read the paper

  • MiMo-V2-Flash Technical Report
    Deliver fast, strong reasoning and agentic performance by combining a large MoE backbone with hybrid attention, multi-token prediction, and multi-teacher on-policy distillation to push decoding speed and parameter efficiency → read the paper

  • K-EXAONE Technical Report
    Provide a multilingual MoE foundation model with long-context support that targets balanced reasoning, agentic, and industrial capabilities across multiple major languages → read the paper

  • LTX-2: Efficient Joint Audio-Visual Foundation Model
    Generate temporally synchronized video and audio in a single unified model by coupling asymmetric modality-specific transformers through cross-attention for efficient, controllable audiovisual synthesis → read the paper

 

Research this week

(🌟 indicates papers we recommend paying attention to)

World models, environments, and embodied learning

Agents, tools, and orchestration

  • Atlas: Orchestrating Heterogeneous Models and Tools for Multi-Domain Complex Reasoning
    Route across models and tools using training-free priors and reinforcement learning to exploit heterogeneity in complex reasoning tasks → read the paper

  • MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning
    Interleave multimodal chain-of-thought reasoning with autonomous tool invocation to solve open-ended, real-world problems → read the paper

  • RelayLLM: Efficient Reasoning via Collaborative Decoding
    Coordinate small and large models at the token level so lightweight models request help only when needed to cut inference cost → read the paper

  • 🌟 Over-Searching in Search-Augmented Large Language Models (Apple)
    Diagnose when retrieval harms efficiency and truthfulness and propose metrics and mitigations for search overuse → read the paper

  • Can We Predict Before Executing Machine Learning Agents?
    Replace costly execution with predictive reasoning by internalizing execution priors and using a predict-then-verify loop → read the paper

  • GenCtrl: A Formal Controllability Toolkit for Generative Models
    Formalize controllability as a control problem and estimate controllable sets to expose the limits of human influence over generation → read the paper

Agent memory, long-horizon reasoning, and experience compression

  • SimpleMem: Efficient Lifelong Memory for LLM Agents
    Compress interaction histories into high-density semantic memory units, consolidate them asynchronously into abstractions, and retrieve them adaptively to reduce token cost while preserving long-term performance → read the paper

  • MAGMA: A Multi-Graph based Agentic Memory Architecture for AI Agents
    Represent memories across semantic, temporal, causal, and entity graphs and retrieve them via policy-guided traversal to enable interpretable, query-aligned long-horizon reasoning → read the paper

  • Memory Matters More: Event-Centric Memory as a Logic Map for Agent Searching and Reasoning
    Organize experiences into an event graph with explicit logical relations to support structured navigation over memory instead of shallow similarity search → read the paper

  • Distilling Feedback into Memory-as-a-Tool
    Amortize inference-time critique by storing feedback as retrievable guidelines that agents can reuse as a tool to reduce reasoning cost → read the paper

Agent evaluation, verification, and confidence

  • Agent-as-a-Judge
    Evolve evaluation from single-pass model judging to agentic judges with planning, tools, collaboration, and memory to enable verifiable multi-step assessment → read the paper

  • Agentic Rubrics as Contextual Verifiers for SWE Agents
    Generate repository-specific rubric checklists via agent interaction to verify code patches without executing tests while remaining grounded and interpretable → read the paper

  • Confidence Estimation for LLMs in Multi-turn Interactions
    Measure and improve confidence calibration across turns by formalizing monotonicity and per-turn reliability as context accumulates → read the paper

  • Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency
    Evaluate belief robustness by probing consistency across contextual neighborhoods rather than relying on point-wise self-consistency → read the paper

Reasoning dynamics, structure, and control

  • DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs
    Reformulate chain-of-thought generation as an iterative denoising process to enable retrospective correction of reasoning steps → read the paper

  • The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning
    Analyze long reasoning traces as structured interaction patterns and guide the synthesis of stable reasoning trajectories → read the paper

  • Mechanistic Interpretability of Large-Scale Counting in LLMs through a System-2 Strategy
    Decompose large counting tasks into reliable subproblems and trace how intermediate counts are represented and aggregated inside the model → read the paper

  • Large Reasoning Models Are (Not Yet) Multilingual Latent Reasoners
    Probe how latent reasoning forms across languages and show that internal reasoning dynamics largely follow an English-centered pathway → read the paper

  • Parallel Latent Reasoning for Sequential Recommendation
    Scale reasoning width by exploring multiple latent reasoning trajectories in parallel to improve generalization under real-time constraints → read the paper

Training efficiency, data efficiency, and optimization

  • SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving
    Push lightweight supervised fine-tuning to state-of-the-art SWE performance through curated datasets, curriculum design, and verifier-based test-time scaling → read the paper

  • One Sample to Rule Them All: Extreme Data Efficiency in RL Scaling
    Demonstrate that a single, carefully engineered training sample can unlock broad reasoning gains across domains via reinforcement learning → read the paper

  • Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting
    Suppress destructive gradients on confident-but-conflicting tokens by gating updates with entropy to reduce catastrophic forgetting during fine-tuning → read the paper

  • Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers
    Replace fixed norm equilibria with learnable scaling factors to adapt weight magnitudes to data and improve downstream performance → read the paper

  • 🌟 GDPO: Group reward-Decoupled Normalization Policy Optimization (Nvidia)
    Decouple reward normalization in multi-reward reinforcement learning to preserve signal resolution and improve training stability → read the paper
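
For readers who want intuition for the GDPO item above, here is a rough sketch of why the order of normalization can matter when several rewards are combined. It is only one reading of the one-line summary, not the paper's algorithm: the function names, the two-reward toy group, and the plain z-score normalization are all assumptions made for illustration, and any further steps the paper applies are omitted.

```python
from statistics import mean, pstdev
from typing import List


def zscore(values: List[float], eps: float = 1e-8) -> List[float]:
    """Standardize values within one group of rollouts."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / (sigma + eps) for v in values]


def sum_then_normalize(rewards: List[List[float]]) -> List[float]:
    """Coupled baseline: add the rewards per rollout, then normalize the totals."""
    return zscore([sum(r) for r in rewards])


def normalize_then_sum(rewards: List[List[float]]) -> List[float]:
    """Decoupled variant (illustrative): normalize each reward, then add."""
    n_rewards = len(rewards[0])
    per_reward = [zscore([r[k] for r in rewards]) for k in range(n_rewards)]
    return [sum(col[i] for col in per_reward) for i in range(len(rewards))]


# Hypothetical group of 4 rollouts, each scored on two binary rewards.
group = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 1.0]]

coupled = sum_then_normalize(group)
decoupled = normalize_then_sum(group)
print([round(a, 3) for a in coupled])
print([round(a, 3) for a in decoupled])
```

In this toy group, rollouts 0 and 1 have the same reward total, so the coupled baseline gives them identical advantages, while the decoupled variant separates them because the two rewards have different spreads within the group. That is one way to picture "preserving signal resolution".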

 

© 2026 Ksenia Se

--c30d5c7551846be2776e2cf916f0330b6603750bf28851b801b95f584efb--