
AI Weekly Briefing – Week Ending 7 November 2025

Reasoning, risk – and agentic AI as real infrastructure


This week’s developments push in three clear directions:

  • Reasoning, not just prediction – especially in complex, regulated domains like cities.

  • Evaluation and risk – more serious work on what our benchmarks and educational deployments are actually doing.

  • Agentic AI as infrastructure – from data clouds and healthcare research to energy, sports and regional transformation programs.


Here’s our curated view across Research and Industry & Policy, with a lens on what matters for organisations in Singapore and Southeast Asia.


Research Highlights


1. Reasoning Is All You Need for Urban Planning AI (NUS)

Paper: Reasoning Is All You Need for Urban Planning AI – Sijie Yang, Jiatong Li, Filip Biljecki (NUS)


This position paper argues that the next frontier in AI for cities is not better prediction models but reasoning-capable planning agents. The authors propose an Agentic Urban Planning AI Framework with three cognitive layers (Perception, Foundation, Reasoning) and six logic components (Analysis, Generation, Verification, Evaluation, Collaboration, Decision), all orchestrated through a multi-agent collaboration setup.


Why it matters:

  • Urban planning decisions are inherently value-based, rule-grounded and politically sensitive. The paper makes a clear case that statistical learning alone isn’t enough – you need explicit reasoning about constraints, trade-offs and stakeholder values.

  • The framework shows how agentic planning agents can augment, not replace, human planners by systematically exploring scenarios, checking regulatory compliance and generating transparent justifications.

  • With authors from NUS and an explicit focus on governance and explainability, this is directly relevant to Singapore’s smart nation, urban analytics and planning ecosystem – a strong reference point for HDB, URA, JTC and related agencies.
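To make the framework's flow concrete, here is a minimal sketch of how its Generation → Verification → Evaluation cycle might look in code. This is our illustrative paraphrase, not the authors' implementation: all names (`Proposal`, `verify`, `plan_cycle`) and the toy `plot_ratio` rule are hypothetical.

```python
# Hypothetical sketch of the paper's logic components (Generation,
# Verification, Evaluation) -- not code from the NUS authors.
from dataclasses import dataclass, field


@dataclass
class Proposal:
    description: str
    violations: list[str] = field(default_factory=list)


def verify(proposal: Proposal, rules: dict[str, float],
           metrics: dict[str, float]) -> Proposal:
    """Verification component: check explicit regulatory constraints,
    recording every rule the proposal breaches."""
    proposal.violations = [
        rule for rule, limit in rules.items()
        if metrics.get(rule, 0.0) > limit
    ]
    return proposal


def plan_cycle(candidates, rules, metrics_for):
    """Generation -> Verification -> Evaluation: keep only candidates
    that pass all rules, so each approval carries a transparent
    justification (an empty violations list)."""
    approved = []
    for c in candidates:
        p = verify(Proposal(c), rules, metrics_for(c))
        if not p.violations:
            approved.append(p)
    return approved
```

The point of the sketch is the paper's core claim: constraints are checked explicitly and symbolically, so a human planner can audit exactly which rule rejected a scenario, rather than trusting an opaque score.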


2. Measuring What Matters in LLM Benchmarks

Paper: Measuring what Matters: Construct Validity in Large Language Model Benchmarks – Andrew M. Bean et al. (NeurIPS 2025 Datasets & Benchmarks track)


A team of 29 expert reviewers systematically analysed 445 LLM benchmarks from leading ML/NLP venues, and found that many of them do a poor job of measuring what they claim – especially for abstract constructs like “safety” and “robustness”. They diagnose recurring issues in what tasks are chosen, how they’re scored, and how results are interpreted, then offer eight recommendations for building more valid benchmarks.


Why it matters:

  • Benchmarks are driving procurement and policy decisions, but this review shows that many popular ones have weak construct validity – they don’t actually measure the thing they’re used to justify.

  • The recommendations are directly actionable for enterprises: tie metrics to real operational risks and tasks, report uncertainty, and avoid over-generalising from narrow benchmarks.

  • For SG/SEA regulators, buyers and internal AI teams, this paper is a strong citation when you insist on fit-for-purpose evaluation frameworks instead of “leaderboard worship”.
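One of the paper's recommendations, reporting uncertainty, is cheap to adopt. A minimal sketch, using a standard bootstrap (our choice of method, not prescribed by the authors): instead of quoting a single accuracy number, quote a confidence interval so small benchmark differences are not over-interpreted.

```python
# Minimal sketch: report a benchmark score with a 95% bootstrap
# confidence interval instead of a bare leaderboard number.
# The bootstrap is standard statistics, not code from the paper.
import random


def accuracy_with_ci(correct: list[bool], n_boot: int = 2000,
                     seed: int = 0):
    """Return point accuracy plus a 95% bootstrap confidence interval."""
    rng = random.Random(seed)
    n = len(correct)
    point = sum(correct) / n
    boots = sorted(
        sum(rng.choices(correct, k=n)) / n for _ in range(n_boot)
    )
    return point, boots[int(0.025 * n_boot)], boots[int(0.975 * n_boot)]


# e.g. a model scoring 72/100 on a hypothetical benchmark
results = [True] * 72 + [False] * 28
acc, lo, hi = accuracy_with_ci(results)
```

On 100 items, the interval spans several percentage points, which is exactly why two models "one point apart" on a small benchmark may be indistinguishable.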


3. Risks of LLMs in Education: From Superficial Outputs to Superficial Learning

Paper: From Superficial Outputs to Superficial Learning: Risks of Large Language Models in Education – Iris Delikoura et al.


This ACM paper synthesises 70 empirical studies on LLM use in education, covering both opportunities and risks. The authors introduce an LLM-Risk Adapted Learning Model that traces how technical risks (hallucinations, bias, privacy issues) cascade through interaction and interpretation into cognitive and behavioural effects like over-reliance, reduced independent learning and diminished student agency.


Why it matters:

  • It’s one of the first systematic, empirical reviews that goes beyond “we’re worried” to map which risks are actually observed in real deployments – from grading bias to reduced deep processing.

  • The paper explicitly highlights that less confident learners are more likely to use LLMs in shallow ways, increasing the equity gap, while stronger learners tend to use them as scaffolds for deeper understanding.

  • For ministries, universities, and training providers in SG, this is a must-read reference when designing GenAI-enabled learning environments, guardrails and educator training – especially as SkillsFuture and higher-ed institutions scale AI use.


Industry & Policy Moves


4. Data & Infrastructure: Agentic AI Moves into the Enterprise Stack


a) Snowflake Intelligence – Agentic AI on the AI Data Cloud

Snowflake announced Snowflake Intelligence, positioned as a way for enterprises to “bring agentic AI to the data” by combining its AI Data Cloud, vector search and new agent capabilities so organisations can deploy agentic AI at scale over governed enterprise data.

b) IBM Fusion + NVIDIA AI Data Platform – Deep Research in Healthcare

IBM announced that IBM Fusion is delivering one of the first implementations of NVIDIA’s AI Data Platform reference design, with UT Southwestern Medical Center as an early adopter. The setup combines NVIDIA GPUs and AI Enterprise software with IBM Fusion’s content-aware data services, which continuously process, index and vectorise unstructured data into “AI-ready” form for agents.

Use cases highlighted include:

  • Drug discovery via NVIDIA BioNeMo on top of Fusion.

  • AI-driven patient avatars for clinical training.

  • Helixa AI Assistant for BioHPC researcher support – all powered by agentic workflows over institutional data.


Why it matters:

  • Snowflake and IBM/NVIDIA are both explicitly marketing “agentic AI on your data” as core to their enterprise story, not just as optional add-ons.

  • The UT Southwestern deployment shows how agentic AI is being woven into deep research and clinical environments, with strong parallels to what large academic hospitals and universities in SG might do.

  • For buyers, this reinforces that the battle is shifting to who can offer the cleanest path from messy enterprise data → vectorised, governed, agent-ready fabric.
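That "messy data → agent-ready fabric" path boils down to a chunk-embed-index-retrieve pipeline. A toy, self-contained sketch of the shape of such a pipeline: real stacks (Snowflake, IBM Fusion) use learned embedding models and governed vector stores, whereas the bag-of-words vectors here are a stand-in so the example runs anywhere.

```python
# Toy illustration of an "unstructured data -> vectorised, searchable
# index" pipeline. Bag-of-words cosine similarity stands in for the
# learned embeddings a production vector store would use.
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Stand-in embedding: term counts instead of a neural vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def build_index(docs: dict[str, str]) -> dict[str, Counter]:
    """One-shot here; the vendors' pitch is that this runs continuously
    as enterprise data changes."""
    return {doc_id: embed(text) for doc_id, text in docs.items()}


def search(index, query: str, k: int = 1):
    q = embed(query)
    ranked = sorted(index, key=lambda d: cosine(index[d], q), reverse=True)
    return ranked[:k]


index = build_index({
    "well-log": "drilling pressure anomaly in well log",
    "hr-memo": "annual leave policy update",
})
```

The governance questions buyers should ask sit around this loop: who can write to the index, how deletions propagate, and which agents can query which partitions.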


5. Sector Platforms: Agentic AI in Energy and Sports


a) SLB’s Tela – Agentic AI for Upstream Energy

SLB (formerly Schlumberger) launched Tela™, an agentic AI assistant aimed at upstream energy. Tela is embedded across SLB’s software, powered by the Lumi™ data and AI platform, and follows a five-step agentic loop: observe, plan, generate, act, learn. Agents can interpret well logs, predict drilling issues and optimise equipment performance, working either alongside humans or autonomously.
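Tela's internals are not public, but the five-step loop SLB names (observe, plan, generate, act, learn) is a recognisable agent pattern. A hypothetical sketch of one pass through such a loop, with all function names and the toy anomaly scenario invented for illustration:

```python
# Hypothetical sketch of an observe-plan-generate-act-learn loop,
# the five-step pattern SLB describes for Tela. Not SLB code.

def agent_step(state, observe, plan, generate, act, learn):
    """One pass through the loop; each phase is a pluggable callable."""
    observation = observe(state)           # observe: read sensors/logs
    goal = plan(observation)               # plan: decide what to achieve
    action = generate(observation, goal)   # generate: draft an action
    outcome = act(action)                  # act: execute, or hand to a human
    return learn(state, outcome)           # learn: fold outcome into state


# Toy run: an agent that tallies the anomalies it has mitigated.
state = {"handled": 0}
state = agent_step(
    state,
    observe=lambda s: {"anomaly": True},
    plan=lambda obs: "mitigate" if obs["anomaly"] else "idle",
    generate=lambda obs, goal: {"do": goal},
    act=lambda a: a["do"] == "mitigate",
    learn=lambda s, ok: {"handled": s["handled"] + int(ok)},
)
```

The design point is the explicit `act` boundary: swapping the `act` callable between "execute autonomously" and "queue for human approval" is how such assistants shift between copilot and autonomous modes.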

b) LALIGA & Globant – Agentic AI Pods for Sports Operations

LALIGA signed a memorandum with Globant to transform league operations using agentic AI Pods, subscription-based teams of AI agents orchestrated by Globant’s experts via the Globant Enterprise AI platform. Through their joint venture Sportian, they’re building agents across talent development, operations and technology – supporting performance analysis, personalised content and workflow automation for one of the world’s most watched sports leagues.


Why it matters:

  • Tela is a textbook example of an industry-specific agentic assistant: domain models + LLMs + workflows tightly integrated into existing tools, not a generic chat interface.

  • LALIGA/Globant’s AI Pods model shows how an organisation can subscribe to “teams of agents” as a service, which is directly relevant to sports, events, and media ecosystems in SG.

  • For regional energy, logistics and sports bodies, these are strong references when considering vertical agentic platforms rather than building from scratch.


6. Agentic Ecosystems & Regional Playbooks


a) HUMAIN ONE + EY MENA – Agentic AI for Corporate Functions

HUMAIN (backed by Saudi Arabia’s Public Investment Fund) and EY MENA announced a collaboration to embed EY’s proprietary AI business solutions into HUMAIN ONE, an agentic AI platform powered by ALLAM, a large Arabic language model. The plan is to turn EY assets across HR, tax, accounting, governance, corporate development and due diligence into HUMAIN ONE agents serving governments and enterprises, starting in Saudi Arabia.

The release explicitly frames this as part of a global shift toward agentic AI and cites projections of a multi-trillion-dollar market by 2030.

b) Microsoft & NVIDIA’s Agentic Launchpad – Startup Hub in the UK & Ireland

Microsoft and NVIDIA launched the Agentic Launchpad program in the UK and Ireland, aimed at startups building “agentic AI” – systems that can act, complete tasks and make decisions independently. As part of Microsoft’s US$30B UK AI investment, selected startups get Azure credits, NVIDIA startup support, engineering guidance and go-to-market help via Microsoft’s channels and marketplace.

c) Big Tech’s Capex Signal

Alongside these ecosystem moves, Meta’s announcement of up to US$600B in US AI data centre spend over the coming years is a stark reminder of the capital intensity behind the AI infra race.


Why it matters:

  • HUMAIN + EY is a blueprint for combining a national LLM stack (ALLAM) with global consulting IP, turning classic service lines (tax, HR, audit, M&A) into agents that can be deployed inside client environments.

  • Microsoft/NVIDIA’s Launchpad shows how big vendors are building targeted support structures for agentic startups, not just generic cloud credits – something SG and regional agencies may want to mirror.

  • Meta’s planned spend underscores that access to hyperscale compute and data centres will remain a strategic bottleneck; regional strategies (including SG’s) have to account for where they plug into this stack.


Closing Thoughts


Across this week, three threads converge:

  • Reasoning and transparency are moving to the foreground: whether in urban planning, benchmark design, or education, there’s a clear push away from “black-box prediction” towards explicit reasoning, construct validity and empirically grounded risk models.

  • Agentic AI is crystallising as infrastructure – not just in Big Tech marketing, but in concrete deployments in healthcare, energy and sports, and in regional platforms like HUMAIN ONE.

  • Ecosystems and capex are becoming as important as models: Launchpads, national LLMs and massive data centre investments are defining who can actually build and run ambitious agentic systems.


For organisations in Singapore and Southeast Asia, the practical questions emerging from this week are:

  • Where do we need reasoning-capable, agentic systems (not just chatbots) – and how do we make their decisions explainable to planners, regulators and boards?

  • Which platform and data strategies put us closest to AI-ready, agent-ready infrastructure without locking us into brittle stacks?

  • How do we build evaluation, risk and education frameworks that keep pace – especially as we deploy LLMs into classrooms, upskilling programs and professional certification?


If you’d like to explore how these themes map onto your organisation’s roadmap – from urban planning and public services to sector-specific agentic pilots – AI Hub SG is happy to help you chart that next step.

© Copyright | AI Hub | All Rights Reserved