What happens when LLMs don’t work as expected?

Unearthing Insights in the Built Environment

Sayjel Vijay Patel
Digital Blue Foam

--

By Sayjel Patel & Aleksei Kondratenko

How can the building industry adopt LLMs for serious use?

As the buzz around ChatGPT wears off, architects, planners, and consultants are eager to explore the practical applications of Large Language Models (LLMs). With the promise of improved planning, optimized designs, fast and accurate cost estimates, and better stakeholder engagement, these models could revolutionize the industry. However, these opportunities do not come without risks.

Drawing from our experiences at Digital Blue Foam (DBF), this post discusses three challenges the building industry must overcome to adopt this fascinating technology.

Challenge 1 — You need a lot of up-to-date corporate data in the right format

Companies are increasingly interested in using their proprietary data to create custom LLMs for the building industry, aiming to produce outputs that align more closely with their specific contexts and challenges. This endeavor requires expertise, computational power, and accurate, up-to-date corporate data. Currently, there are two main methods: fine-tuning and Retrieval-Augmented Generation (RAG), both of which require extensive access to the company’s data to achieve optimal results.

Unlike areas where open-source data suffices, the building industry demands specific project data. This might seem straightforward, but in practice, it’s extremely challenging. Compiling project information from several years into a uniform format is not only technically demanding but also often faces insurmountable confidentiality constraints. Moreover, eliminating bias is a formidable task, as biases are embedded in every aspect of a project, from designer and client inputs to local regulations and standards.
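To make the RAG approach concrete, here is a minimal sketch of its retrieval step. A production pipeline would use an embedding model and a vector database; plain keyword overlap stands in for both here, and the project records in the corpus are invented for illustration.

```python
# Minimal sketch of the retrieval step in a RAG pipeline.
# Keyword overlap stands in for embeddings + a vector database.

def score(query: str, document: str) -> float:
    """Fraction of query words that appear in the document."""
    q_words = set(query.lower().split())
    d_words = set(document.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents most relevant to the query."""
    ranked = sorted(documents, key=lambda d: score(query, d), reverse=True)
    return ranked[:top_k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Augment the user's question with retrieved corporate context."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical corporate project records
corpus = [
    "Project Alpha: 12-storey residential tower, concrete frame, 2021.",
    "Project Beta: timber office pavilion, 3 storeys, completed 2023.",
    "Internal standard: all residential towers require two egress stairs.",
]

prompt = build_prompt("egress requirements for residential towers", corpus)
print(prompt)
```

Even this toy version shows why data format matters: retrieval only works if project records are stored as clean, self-contained text snippets rather than scattered across drawings, spreadsheets, and emails.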

So what?

  • Customized LLMs need structured, accurate, and up-to-date corporate data, which is often unavailable or financially prohibitive to obtain.
  • Keeping the models accurate requires ongoing costs for updating, managing, and storing data.

Challenge 2 — Overcoming made-up or biased results

Would you base a decision on the output of an LLM, knowing it might sometimes ‘hallucinate’ — producing inaccurate or unfounded outputs? This phenomenon presents a significant challenge in the building industry, where precision is essential for tasks ranging from code compliance checks to material specifications. Moreover, bias in model outputs is another critical issue; LLMs can reflect or even amplify biases from their training data, resulting in skewed outputs. Given these considerations, despite technological advancements, it will likely be some time before LLMs can be reliably used for critical decision-making tasks.
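One pragmatic mitigation is to never let a model-cited figure reach a deliverable unverified. The sketch below cross-checks any number the model quotes against a trusted reference table; the table contents, parameter names, and checked claims are invented for illustration.

```python
# Sketch of a guardrail against hallucinated figures: any numeric value
# the model cites is cross-checked against a trusted reference table
# before it reaches a report.

import re

# Hypothetical authoritative values (e.g. from an internal code database)
TRUSTED_VALUES = {
    "min_corridor_width_mm": 1200,
    "max_travel_distance_m": 45,
}

def check_claim(parameter: str, llm_answer: str) -> bool:
    """Accept the model's answer only if its number matches the source of truth."""
    match = re.search(r"\d+(?:\.\d+)?", llm_answer)
    if match is None:
        return False  # no figure to verify: reject rather than trust
    return float(match.group()) == TRUSTED_VALUES[parameter]

# A correct answer passes; a hallucinated one is flagged
print(check_claim("min_corridor_width_mm", "The minimum corridor width is 1200 mm."))  # True
print(check_claim("min_corridor_width_mm", "The minimum corridor width is 900 mm."))   # False
```

The design choice here is deliberate: the LLM drafts, but an external source of truth decides, which keeps hallucinations out of compliance-critical outputs.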

So what?

  • Delays and cost overruns: Inaccuracies or discrepancies in the information provided by LLMs can lead to misunderstandings, project delays, and escalated costs.
  • Client presentation challenges: Biased or inaccurate results cannot be presented to clients without undermining the transparency and credibility of decisions.

Challenge 3 — Inability to explain reasoning or justify outputs

Trusting LLMs like GPT-4 is difficult without clear explanations for their decisions. LLMs struggle with tasks requiring straightforward reasoning or sequence planning [1] — a stark contrast to other types of AI, such as Reinforcement Learning (RL), which excel in these areas. Trust in LLMs depends on the context: they’re useful for general inquiries, but caution is necessary for critical decisions that demand precise logic. Ongoing research is focused on improving the transparency and reasoning capabilities of LLMs, with the goal of combining their vast knowledge with the problem-solving prowess of RL agents for a more comprehensive AI solution.
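One way to bridge this gap today, in the spirit of PlanBench-style plan validation, is to check any LLM-proposed task sequence against hard constraints that a human can inspect. The sketch below validates a construction sequence against precedence rules; the task names and dependencies are invented for illustration.

```python
# Sketch of validating an LLM-proposed task sequence against hard
# precedence constraints, so the acceptance decision is explainable
# even when the model's reasoning is not.

# task -> tasks that must be completed first
DEPENDENCIES = {
    "foundation": [],
    "structure": ["foundation"],
    "roofing": ["structure"],
    "interior_fitout": ["structure", "roofing"],
}

def is_valid_plan(plan: list[str]) -> bool:
    """True if every task appears after all of its prerequisites."""
    done = set()
    for task in plan:
        if any(dep not in done for dep in DEPENDENCIES.get(task, [])):
            return False
        done.add(task)
    return True

print(is_valid_plan(["foundation", "structure", "roofing", "interior_fitout"]))  # True
print(is_valid_plan(["structure", "foundation", "roofing", "interior_fitout"]))  # False
```

The validator, not the LLM, provides the rationale: a rejected plan can always be traced to a specific violated dependency.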

So what?

  • LLMs are unable to provide an objective rationale for their outputs, so it is not possible to explain how they arrived at an answer or decision.

Next Steps

To fully harness the potential of Large Language Models (LLMs) in the building industry, a shift from viewing them as novel tools to essential assets is required. This transition involves tackling significant challenges: securing access to updated and structured data without compromising confidentiality, reducing the risks associated with inaccurate or ‘hallucinated’ information and built-in biases, and closing the gap in LLMs’ capability to explain their decision-making processes. By carefully assessing these risks and considering hybrid models that combine LLMs’ comprehensive knowledge with the problem-solving capabilities of other AI types, the industry can leverage LLMs not merely as convenient tools but as foundational components for pioneering building design and construction methodologies.

N O T E S

1 K. Valmeekam, M. Marquez, A. Olmo, S. Sreedharan, S. Kambhampati, “PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change,” 37th Conference on Neural Information Processing Systems (NeurIPS), 2023.

--

Sayjel Vijay Patel
Digital Blue Foam

CTO of Digital Blue Foam and Founding Professor at the Dubai Institute of Design and Innovation. MIT M.Arch ‘15