-
Notifications
You must be signed in to change notification settings - Fork 0
Why Do LLMs Design Mediocre Architecture? #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,204 @@ | ||
+++ | ||
date = '2025-09-14T15:45:21-07:00' | ||
draft = true | ||
title = 'Why Do LLMs Design Mediocre Architecture?' | ||
+++ | ||
|
||
LLMs create mediocre architecture that compounds technical debt. | ||
By "mediocre architecture" I mean code that's functionally correct but hard to maintain and expand. | ||
In our example it's duplicated code without opinionated abstractions. | ||
|
||
Where should we expect them to fail? | ||
How can tech leads benefit from increases in velocity without eventually crumbling under tech debt? | ||
How should teams change their working style as a result? | ||
|
||
## Worked Example: Implementing Email Notifications | ||
|
||
I ran Claude Code in the [FastAPI Example App](http://github.com/fastapi/fastapi-example-app) with the following prompt: | ||
ab-10 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
> Add a notification system to this FastAPI app that sends email alerts when users perform certain actions (user registration, password reset, important data changes). | ||
> The system should support multiple notification channels in the future (SMS, push notifications) and handle failures gracefully with retry logic. | ||
|
||
### Claude Code's Implementation | ||
|
||
Claude found an existing function for email notifications `generate_new_account_email` and followed a similar pattern in defining the new functions: `generate_admin_account_status_change_email`, `generate_admin_profile_update_email`, `generate_email_change_notification`, `generate_new_account_email`, and `generate_profile_update_email`. | ||
|
||
Here's an example for one of them: | ||
|
||
```python | ||
def generate_admin_account_status_change_email( | ||
email_to: str, | ||
full_name: str | None, | ||
status_change: str, | ||
admin_email: str, | ||
timestamp: str, | ||
reason: str | None = None | ||
) -> EmailData: | ||
project_name = settings.PROJECT_NAME | ||
subject = f"{project_name} - Account Status Update" | ||
html_content = render_email_template( | ||
template_name="admin_account_status_change.html", | ||
context={ | ||
"project_name": project_name, | ||
"full_name": full_name or "User", | ||
"status_change": status_change, | ||
"admin_email": admin_email, | ||
"timestamp": timestamp, | ||
"reason": reason, | ||
"login_link": settings.FRONTEND_HOST, | ||
"support_link": f"{settings.FRONTEND_HOST}/support", | ||
}, | ||
) | ||
return EmailData(html_content=html_content, subject=subject) | ||
``` | ||
|
||
And here's how it's used: | ||
|
||
```python | ||
def update_user( | ||
*, | ||
session: SessionDep, | ||
user_id: uuid.UUID, | ||
user_in: UserUpdate, | ||
current_user: CurrentUser, | ||
) -> Any: | ||
# 20 lines of endpoint handling code | ||
|
||
# All code until `return` is new notification logic | ||
if settings.emails_enabled: | ||
user_data = user_in.model_dump(exclude_unset=True) | ||
changed_fields = get_changed_fields_display(old_data, user_data) | ||
timestamp = format_timestamp() | ||
|
||
# Check if account status was changed | ||
if "is_active" in user_data and user_data["is_active"] != old_data["is_active"]: | ||
try: | ||
status_change = "Activated" if user_data["is_active"] else "Deactivated" | ||
email_data = generate_admin_account_status_change_email( | ||
email_to=db_user.email, | ||
full_name=db_user.full_name, | ||
status_change=status_change, | ||
admin_email=current_user.email, | ||
timestamp=timestamp, | ||
) | ||
send_email( | ||
email_to=db_user.email, | ||
subject=email_data.subject, | ||
html_content=email_data.html_content, | ||
) | ||
except Exception as e: | ||
logger.info(f"Failed to send status change email to {db_user.email}: {e}") | ||
|
||
elif changed_fields: | ||
try: | ||
email_data = generate_admin_profile_update_email( | ||
email_to=db_user.email, | ||
full_name=db_user.full_name, | ||
changed_fields=changed_fields, | ||
admin_email=current_user.email, | ||
timestamp=timestamp, | ||
) | ||
send_email( | ||
email_to=db_user.email, | ||
subject=email_data.subject, | ||
html_content=email_data.html_content, | ||
) | ||
except Exception as e: | ||
logger.info(f"Failed to send admin update email to {db_user.email}: {e}") | ||
|
||
return db_user | ||
``` | ||
|
||
Here Claude has successfully understood the existing code structure and has created a new function that follows a similar pattern as before. | ||
The functions to generate emails and their usage are nearly identical with the existing code. | ||
|
||
But here's the thing, if this were a PR, I'd reject it. | ||
|
||
### Why This Doesn't Work | ||
|
||
The existing approach to sending notification emails has reached its critical mass. | ||
We now have 6 duplicate functions doing essentially the same thing. | ||
About 60% of the endpoint code is just handling notification emails. | ||
|
||
There are also two dangerous gotchas in the code: | ||
|
||
1. The code catches bare `Exception`s and logs them at `info` level - a surefire way to leave logical bugs undiscovered forever. | ||
2. The HTML templates used by generator methods don't actually exist. | ||
|
||
|
||
### Conflicting Optimization Objectives | ||
|
||
Better design would require removing the existing email function and replacing it with an abstraction. | ||
ab-10 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
But that means making opinionated decisions about the codebase. | ||
|
||
This is RLHF-induced sycophancy at work. | ||
Not only does it respond to you with "you're perfectly right!", it considers your code as such. | ||
ab-10 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
Well, unless you're terribly off course. | ||
Claude saw one way of doing emails and assumed that was the way. | ||
|
||
From a product perspective, this makes sense. I rarely want an LLM coding agent making radical changes to my codebase. | ||
|
||
Instead of being a problem, this is an opportunity to delineate responsibilities. | ||
|
||
## Reclaim Your Role as an Architect | ||
|
||
When LLMs handle the implementation details, our role as reviewers fundamentally changes. | ||
We're no longer line-by-line code inspectors catching syntax errors or style violations. | ||
Instead, we become architectural guardians, focusing on the bigger questions: | ||
Are we solving the right problem? | ||
Is this the right abstraction? | ||
Will this scale? | ||
|
||
This shift isn't just about adapting to AI tools; it's about reclaiming the most intellectually rewarding part of software development—the design decisions that shape how our systems grow and evolve. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. idk how picky we want to be about the blog post appearing like it has been written by an LLM but if we are then it's worth flagging that the "This isn't about X; it's about Y" pattern is one that's overly used by LLMs There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. same goes to the em dash :P |
||
|
||
The first line of defence is a good ol' whiteboard meeting. | ||
I've noticed myself pulling a colleague aside for a design discussion more frequently. | ||
Especially as I know that once I have a solid high-level design, the (LLM-assisted) implementation will be easy. | ||
|
||
Sometimes, just implementing a feature is faster than having a long discussion around it. | ||
Normally, I'm hesitant to suggest drastic architectural changes in PRs. | ||
The tradeoff between clean software architecture and time taken to implement it is rarely worth it in a startup. | ||
Especially, if there's an existing solution. | ||
ab-10 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
However, as the code can be easily generated, that's no longer a valid concern. | ||
That's why, when reviewing PRs, I now pay more attention to high-level design decisions. | ||
|
||
As a reviewer, it is difficult to think about both the big picture design and line-by-line implementation all at once. | ||
That's why I outsource most of the low-level thinking to automated tooling. | ||
Tests and classic standard analyzers are helpful here. | ||
Obviously we use Recurse ML as our safety net. | ||
|
||
However, one issue I find myself battling with is that LLMs tend to repeat the same mistakes over and over again. | ||
This is where [custom rules](https://docs.recurse.ml/gh/configs/rules/) are particularly helpful. | ||
Think of them as code review comments that automatically check every PR. | ||
When the tool spots code that violates these patterns, it leaves a comment explaining the issue. | ||
|
||
I've learned from [Nate](https://natesesti.com) @ Continue.dev the following rule of thumb: | ||
ab-10 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
> Whenever I leave a comment in a PR, I ask myself, should this be an automatic rule? | ||
> Most of the time the answer is "yes" and I create it. | ||
|
||
For example, I created [`bare_exceptions.md`](https://github.com/continuedev/awesome-rules/blob/main/rules/recurse-ml/bare_exceptions.md) to prevent the bare `Exception` case from making it into our production codebase. | ||
|
||
|
||
## Conclusion | ||
|
||
LLMs create mediocre architecture because they optimize for consistency with existing patterns, not optimal design. | ||
This isn't a bug—it's how they're trained to be helpful rather than opinionated. | ||
ab-10 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
This limitation is an opportunity. | ||
When LLMs handle implementation, we get to focus on the hard architectural decisions that actually matter. | ||
|
||
The practical takeaways: | ||
1. Design first, implement second. Pull a colleague aside for that whiteboard session. | ||
2. Review PRs for architecture, not syntax. Let tests and linters catch the small stuff. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should mention Recurse ML here, no? Something like |
||
3. Turn repeated review comments into automated rules. If you're saying it twice, automate it. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Again, maybe we should name drop Recurse ML or link to it. Backlinks and mentioning is good for SEO |
||
4. Don't hesitate to suggest refactoring. When implementation is cheap, good design becomes affordable. | ||
|
||
Embrace the promise of LLMs to free humans from rote work towards creative high-level tasks. | ||
Designing elegant software is hard. | ||
It requires understanding existing code, having a mental model of people who will maintain it and the environment it will run in. | ||
However, it's also fun. | ||
It's creative in the same way that writing or visual art is creative. | ||
The space of viable solutions is vast. | ||
Greatness is highly subjective and context-dependent. | ||
But creating something, that other people appreciate and build upon is one of the most rewarding parts of the human experience. | ||
ab-10 marked this conversation as resolved.
Show resolved
Hide resolved
|
Uh oh!
There was an error while loading. Please reload this page.