This prototype implements an end-to-end agentic pipeline that converts raw LiDAR point clouds into explainable robot navigation decisions. The system ingests .pcd/.ply point cloud files alongside YAML scene metadata and processes them through four orchestrated stages:
Perception: An Open3D-based module segments the ground plane, clusters obstacle geometry, and produces a structured semantic scene summary including object positions, regions, and free-space estimates.
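A minimal sketch of this stage, using Open3D's RANSAC plane segmentation and DBSCAN clustering. The file name and threshold values are illustrative placeholders, not the project's actual configuration:

```python
import numpy as np
import open3d as o3d

# Load the point cloud (.pcd and .ply are both handled by Open3D).
pcd = o3d.io.read_point_cloud("scene.pcd")

# RANSAC plane fit separates the ground from obstacle points.
plane_model, inlier_idx = pcd.segment_plane(
    distance_threshold=0.05,  # illustrative tolerance, in meters
    ransac_n=3,
    num_iterations=1000,
)
ground = pcd.select_by_index(inlier_idx)
obstacles = pcd.select_by_index(inlier_idx, invert=True)

# DBSCAN groups the remaining points into obstacle candidates
# (label -1 marks noise points).
labels = np.array(obstacles.cluster_dbscan(eps=0.3, min_points=20))

# Summarize each cluster as a centroid for the semantic scene summary.
points = np.asarray(obstacles.points)
summary = [
    {
        "id": label,
        "centroid": points[labels == label].mean(axis=0).tolist(),
        "num_points": int((labels == label).sum()),
    }
    for label in range(labels.max() + 1)
]
```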
RAG (Retrieval-Augmented Generation): A LangChain retrieval layer backed by a FAISS vector index queries an in-repo safety and planner knowledge base to surface relevant operational context for the current scene.
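A hedged sketch of the retrieval layer, assuming the langchain_community / langchain_openai package layout (import paths vary across LangChain releases); the knowledge-base snippets below are placeholders for the real in-repo safety and planner documents:

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Build the FAISS index over the knowledge base (placeholder snippets).
docs = [
    "Reduce speed below 0.5 m/s when an obstacle is within 2 m.",
    "Prefer right-side passing in narrow corridors.",
]
index = FAISS.from_texts(docs, OpenAIEmbeddings())

# Retrieve operational context relevant to the current scene summary.
scene_query = "pedestrian detected 1.5 m ahead in a narrow corridor"
for doc in index.similarity_search(scene_query, k=2):
    print(doc.page_content)
```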
State Extraction: MCP-style tool functions parse the YAML metadata to extract robot pose, ego speed, and the positions of nearby actors, feeding structured state into the decision layer.
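A minimal sketch of one such tool function. The keys used here (robot_pose, ego_speed_mps, actors) are a hypothetical schema for illustration, not the project's actual metadata format:

```python
import yaml

def extract_robot_state(metadata_path: str) -> dict:
    """MCP-style tool: parse scene metadata into structured robot state.

    The schema (robot_pose, ego_speed_mps, actors) is hypothetical.
    """
    with open(metadata_path) as f:
        meta = yaml.safe_load(f)
    return {
        "pose": meta["robot_pose"],          # e.g. {x, y, heading_deg}
        "speed_mps": meta["ego_speed_mps"],
        "actors": [
            {"type": a["type"], "position": a["position"]}
            for a in meta.get("actors", [])
        ],
    }
```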
Coordinator: A rule-based coordinator fuses perception output, retrieved context, and robot state into a final navigation decision. When an OpenAI key is available, a LangGraph-orchestrated, LLM-backed path is attempted first, with automatic fallback to the rule-based logic.
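A sketch of the rule-based fallback path. The thresholds and the obstacle/perception field names are illustrative, not the shipped rules:

```python
def decide(perception: dict, context: list[str], state: dict) -> dict:
    """Rule-based fallback: fuse perception, retrieved context, and
    robot state into a navigation decision. Thresholds are illustrative."""
    nearest = min(
        (obs["distance_m"] for obs in perception["obstacles"]),
        default=float("inf"),
    )
    if nearest < 1.0:
        risk, action = "High", "stop"
    elif nearest < 3.0 or state["speed_mps"] > 1.0:
        risk, action = "Medium", "slow_down"
    else:
        risk, action = "Low", "proceed"
    return {
        "risk_level": risk,
        "action": action,
        "evidence": {"nearest_obstacle_m": nearest, "retrieved": context},
    }
```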
The system outputs a risk level (Low / Medium / High), a recommended action, a scene assessment, and supporting evidence. A Streamlit front-end exposes an interactive 3D Plotly visualization of the scene alongside all intermediate outputs, making the pipeline fully inspectable at every stage.
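A minimal sketch of the front-end pattern, with random points standing in for the real perception output:

```python
import numpy as np
import plotly.graph_objects as go
import streamlit as st

# Point coordinates and per-point cluster labels would come from the
# perception stage; random data stands in for them here.
points = np.random.rand(500, 3)
labels = np.random.randint(0, 4, size=500)

fig = go.Figure(
    go.Scatter3d(
        x=points[:, 0], y=points[:, 1], z=points[:, 2],
        mode="markers",
        marker=dict(size=2, color=labels),
    )
)
st.plotly_chart(fig)
st.json({"risk_level": "Low", "action": "proceed"})  # intermediate outputs
```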
Stack: Python · Open3D · LangGraph · LangChain · FAISS · OpenAI API · Streamlit · Plotly