blueprint_llm_techspec-0a1b78 #549
Replies: 2 comments
This is a great technical document, thanks Umair. I think the SAM tool is straightforward: it simply highlights based on the YOLO spatial hints. So as long as the LLM gives a correct instruction and YOLO does the extraction correctly, SAM will be mostly accurate, right? In that case we should not worry too much about the SAM tool.

Q1: The current workflow fits the Q&A style. We also have the "quantity widget", right? Eventually estimators will export the quantity widget as the final takeoff. Should we define how this system writes to the quantity widget, e.g. that these are the official results estimators will export to a CSV/PDF file?

Q2: When the LLM calls the tools, it gives instructions like "object=LIGHTS, location=KITCHEN" or "object=OUTLET, color=RED". What if users don't give enough detail in their prompt? For "Count lights", would the LLM understand location = full current page? For "Highlight outlets", will the LLM just pick a color? In other words, is the LLM smart enough to fill in the instruction itself and/or ask users to complete it?

Q3: Does the LLM simply pass the instruction to the tools? For "How many lights in the kitchen?", lights can be recessed or chandelier; who determines what kind of lights to count?

Q4: I am still unsure who reads the legends. It seems YOLO is reading the legend. As soon as the floorplan is uploaded, does YOLO read the blueprint directly and produce all outcomes in a cache, so that LLM calls just retrieve the results from the cache?

Q5: In "2.2.1 Classes Detected", is the class list pre-defined? For example, there might be different recessed lights in the legend; will YOLO recognize them, or will it only detect its fixed classes?
A1 - Yes, the same cached counts/areas that power Q&A are the system of record and will feed the quantity widget. The spec should add an explicit "write quantities from cache → widget → CSV/PDF export" step so estimators treat that as the official takeoff (see the first sketch below).

A2 - The LLM always sends fully specified arguments to tools; when users omit details it applies safe defaults (e.g., location = current page, color = system default) or asks a clarifying question if the omission would materially change the result (second sketch below). This behavior should be documented in the spec so UX and estimators know how vague prompts are resolved.

A3 - The LLM does not infer light types from text; it queries the YOLO cache, where each detection is already labeled (RECESSED, SURFACE, CHANDELIER, etc.). By default "lights" aggregates all light subclasses, but the spec can define configurable profiles (e.g., "count only recessed lights" vs. "all fixtures") so estimators know exactly what is included (third sketch below).

A4 - Currently legend reading is on GPT Vision, but for better accuracy legend interpretation should happen once at upload: YOLO plus a small LLM pass maps legend symbols (H1, CT, etc.) to semantic classes and stores them in the blueprint cache (fourth sketch below). All later LLM calls, including Q&A and highlighting, only read from this cache and never re-scan the raw PDF, which keeps responses fast and consistent.

A5 - Yes, section 2.2.1's class list is a fixed label set baked into the YOLO model, including any recessed-light variants we choose to train on. If the legend introduces a symbol that is not mapped to one of these classes, YOLO will not detect it until we extend the label set, add training data, and retrain the model.
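For A1, a minimal sketch of the proposed cache → widget → export step. All names here (`QuantityRow`, `export_takeoff`, the `cache["counts"]` shape) are hypothetical; the spec would define the real interfaces:

```python
import csv
from dataclasses import dataclass

@dataclass
class QuantityRow:
    object_class: str   # e.g. "RECESSED_LIGHT"
    location: str       # e.g. "KITCHEN"
    count: int

def export_takeoff(cache: dict, csv_path: str) -> list[QuantityRow]:
    """Read the official counts from the YOLO cache, build the
    quantity-widget rows, and write the CSV estimators export."""
    rows = [
        QuantityRow(object_class=cls, location=loc, count=n)
        for (cls, loc), n in cache["counts"].items()
    ]
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["class", "location", "count"])
        for r in rows:
            writer.writerow([r.object_class, r.location, r.count])
    return rows  # the same rows populate the quantity widget UI
```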
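For A2, a sketch of how the tool layer might normalize under-specified arguments before a tool call. The default values and the "ask only when material" rule are assumptions to be confirmed in the spec:

```python
DEFAULT_COLOR = "YELLOW"  # assumed system default highlight color

def normalize_tool_args(args: dict, current_page: int) -> dict | str:
    """Fill safe defaults for omitted arguments, or return a
    clarification question when the gap would change the result."""
    filled = dict(args)
    # "Count lights" with no location -> assume the current page.
    filled.setdefault("location", f"PAGE_{current_page}")
    # "Highlight outlets" with no color -> assume the system default.
    filled.setdefault("color", DEFAULT_COLOR)
    # A missing object class changes the result entirely: ask instead.
    if "object" not in filled:
        return "Which object type should I work with (lights, outlets, ...)?"
    return filled
```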
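For A3, a sketch of profile-based counting over the cached detections, assuming each cached detection carries `class` and `location` fields; the profile names are illustrative:

```python
LIGHT_SUBCLASSES = {"RECESSED_LIGHT", "SURFACE_LIGHT", "CHANDELIER"}

# Hypothetical profiles; the spec would define the real ones.
COUNT_PROFILES = {
    "all_fixtures": LIGHT_SUBCLASSES,
    "recessed_only": {"RECESSED_LIGHT"},
}

def count_lights(cache: dict, location: str,
                 profile: str = "all_fixtures") -> int:
    """Aggregate already-labeled YOLO detections from the cache;
    no inference happens here, only filtering and summing."""
    wanted = COUNT_PROFILES[profile]
    return sum(
        1
        for det in cache["detections"]
        if det["class"] in wanted and det["location"] == location
    )
```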
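For A4, a minimal sketch of the one-time legend pass at upload, assuming YOLO yields legend symbol detections and the small LLM pass returns text labels for them; the symbol and class names are illustrative, not from the spec:

```python
def build_legend_map(legend_detections: list[dict],
                     llm_labels: dict) -> dict:
    """One-time pass at upload: combine YOLO's legend symbol
    detections with the LLM's reading of the legend text, and
    store the symbol -> semantic class map in the blueprint cache."""
    legend_map = {}
    for det in legend_detections:          # e.g. {"symbol": "H1", ...}
        symbol = det["symbol"]
        # llm_labels: e.g. {"H1": "RECESSED_LIGHT", "CT": "CABLE_TRAY"}
        legend_map[symbol] = llm_labels.get(symbol, "UNKNOWN")
    return legend_map  # cached; later Q&A/highlight calls read only this
```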
https://mdshare.vercel.app/blueprint_llm_techspec-0a1b78