Design Proposal: Client-Assisted Staleness Detection & Amendment System for Member Data
Author: Stephen Tan
Date: June 28, 2025
Status: Client Side is to be implemented
Scope: Applies to event.members and chat.members user data
NOTE:
- This is the concept for how data will be kept in sync in CaughtUp.app
- This is an abstracted overview of the system.
- Data is denormalized across the system for speed!
Problem
User profile data (e.g., display name, avatar) is duplicated across:
- Friends nodes in /users/{userId}/friends
- Members collections in /events/{eventId}/members and /chats/{chatId}/members
Updating every instance on user profile change causes:
- Redundant writes for stale/unvisited resources (e.g., inactive chats, old events)
- Unnecessary reads and comparisons on already synced data
Backend Data Structure Overview
User profile data (e.g. display name, photoURL) is denormalized across multiple collections:
/
├── users/
│   └── {userId}/
│       └── friends/
│           └── {friendId}: {displayName, photoURL, ...}
├── chats/
│   └── {chatId}/
│       └── members/
│           └── {userId}: {displayName, photoURL, ...}
├── events/
│   └── {eventId}/
│       └── members/
│           └── {userId}: {displayName, photoURL, ...}
When a user updates their profile, data in /users/{userId} and /users/{friendId}/friends/{userId} is updated immediately, but duplicated values in every event or chat they participated in are not.
With hundreds of references per user, updating them all reactively (on every change) could be expensive. The challenge is balancing:
- Data freshness in visible or relevant places
- System-wide performance and bandwidth
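As a rough illustration of the layout above, the duplicated profile fields and the three locations that hold them could be modeled as follows. The type and function names are hypothetical; only displayName, photoURL, and the path shapes come from the structure shown earlier, and any further fields are left out.

```typescript
// Hypothetical name for the profile fields copied into friends and members nodes.
// The source lists displayName and photoURL; other fields are omitted here.
interface ProfileSnapshot {
  displayName: string;
  photoURL: string;
}

// Path builders for the three places where a snapshot is duplicated.
const friendPath = (ownerId: string, friendId: string): string =>
  `/users/${ownerId}/friends/${friendId}`;

const chatMemberPath = (chatId: string, userId: string): string =>
  `/chats/${chatId}/members/${userId}`;

const eventMemberPath = (eventId: string, userId: string): string =>
  `/events/${eventId}/members/${userId}`;

// Example: the same user appears under a friend list, a chat, and an event.
const exampleSnapshot: ProfileSnapshot = { displayName: "Ann", photoURL: "ann.png" };
const exampleCopies: string[] = [
  friendPath("u2", "u1"),
  chatMemberPath("c1", "u1"),
  eventMemberPath("e1", "u1"),
];
```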
Design Goal
Deliver a scalable, performant strategy for keeping profile metadata in chat.members and event.members up-to-date without unnecessary compute.
Proposal: Client-Assisted Amendment Flow
Clients detect stale user data using local knowledge from their /friends nodes. If the client detects older or mismatched user data when fetching event.members or chat.members, it flags those user IDs to an amendment endpoint.
The backend then asynchronously:
- Validates whether data is stale.
- Updates outdated member documents (only if necessary).
This avoids wasteful global updates and server reads unless there’s a real discrepancy.
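The client-side check might look roughly like the sketch below. It assumes the friend cache is available as a map keyed by user ID and compares only displayName and photoURL; the names findSuspectedStale, Member, and ProfileSnapshot are illustrative and not part of the proposal. It checks for mismatches only, since the source does not say how "older" would be determined.

```typescript
interface ProfileSnapshot {
  displayName: string;
  photoURL: string;
}

interface Member extends ProfileSnapshot {
  userId: string;
}

// Returns the IDs of members whose copied profile differs from the local friend cache.
// Members missing from the cache are skipped: the client has nothing to compare against.
function findSuspectedStale(
  members: Member[],
  friends: Map<string, ProfileSnapshot>
): string[] {
  const suspects: string[] = [];
  for (const member of members) {
    const cached = friends.get(member.userId);
    if (cached === undefined) continue;
    if (
      cached.displayName !== member.displayName ||
      cached.photoURL !== member.photoURL
    ) {
      suspects.push(member.userId);
    }
  }
  return suspects;
}
```

The returned IDs are what the client would send to the amendment endpoint. The request body shape is not specified in this proposal, so it is left out here.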
Sequence Diagram (Mermaid)
This diagram separates the server into microservices:
- GetAPI: Handles event member fetches
- AmendmentAPI: Flags member data as stale
- AmendmentWorker: Listens for amendments, validates, and updates data
- Database: Stores all data
Operational Flow (Event/Chat Load)
- Client calls getEventMembers(eventId) or getChatMembers(chatId) via the GetAPI microservice on the server.
- The GetAPI fetches member data from the Database and returns it to the client.
- The client compares each member.displayName / photoURL with its local friends[] list.
- For any mismatches (suspected stale data), the client sends the affected user IDs to the server’s AmendmentAPI (POST /amendUserData).
- The AmendmentAPI marks the relevant member records as stale in the Database.
- The AmendmentWorker (a server-side function) listens for new stale flags, then:
  - Fetches the flagged member data from the Database
  - Fetches the source-of-truth user data from /users/{userId}
  - Compares the two and updates the member document with fresh user data if needed
- The client will see updated data on the next load or via live sync.
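The AmendmentWorker step in the flow above can be sketched as a pure validation function plus a small loop. This is only an in-memory illustration: the two maps stand in for a members collection and /users, and validateFlag, processFlags, and AmendmentResult are hypothetical names rather than the actual worker code.

```typescript
interface ProfileSnapshot {
  displayName: string;
  photoURL: string;
}

// Outcome of validating one flagged member record against the source of truth.
type AmendmentResult =
  | { action: "update"; fresh: ProfileSnapshot }
  | { action: "skip" };

// The server, not the client, decides whether a write happens.
function validateFlag(copy: ProfileSnapshot, source: ProfileSnapshot): AmendmentResult {
  const same =
    copy.displayName === source.displayName && copy.photoURL === source.photoURL;
  if (same) return { action: "skip" };
  return {
    action: "update",
    fresh: { displayName: source.displayName, photoURL: source.photoURL },
  };
}

// Processes flagged user IDs for one chat or event and returns the IDs actually updated.
function processFlags(
  flaggedIds: string[],
  members: Map<string, ProfileSnapshot>, // stand-in for one members collection
  users: Map<string, ProfileSnapshot> // stand-in for /users/{userId}
): string[] {
  const updated: string[] = [];
  for (const id of flaggedIds) {
    const copy = members.get(id);
    const source = users.get(id);
    if (copy === undefined || source === undefined) continue; // unknown ID: ignore the flag
    const result = validateFlag(copy, source);
    if (result.action === "update") {
      members.set(id, result.fresh);
      updated.push(id);
    }
  }
  return updated;
}
```

Because the comparison runs against the source of truth, a wrong or malicious flag costs two reads but never produces a write.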
Options Summary
🔁 Option 1: Real-Time Comparison on Fetch
When getChatMembers or getEventMembers is called, the server compares all member data with /users/{userId} and performs immediate updates if there are mismatches.
Pros:
- Guarantees the freshest data immediately
- Simple comparison logic
Cons:
- High per-call latency
- Performs unnecessary reads/writes if data hasn’t changed
- Painful for large groups
⏳ Option 2: Timestamp-Gated Background Sync
On each getMembers call, if a member’s lastSyncedAt is older than X days (e.g., 3), the server adds the member to an update queue. A background task compares and updates after the fact.
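A minimal sketch of the timestamp gate follows, assuming lastSyncedAt is stored as epoch milliseconds (the proposal does not state the format) and using the hypothetical name selectForSync. Note that it still visits every member on each call, which is the cost listed among the cons of this option.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

interface SyncedMember {
  userId: string;
  lastSyncedAt: number; // epoch milliseconds (assumed representation)
}

// Returns the IDs of members whose last sync is strictly older than maxAgeDays.
function selectForSync(
  members: SyncedMember[],
  nowMs: number,
  maxAgeDays: number
): string[] {
  const cutoff = nowMs - maxAgeDays * DAY_MS;
  return members
    .filter((member) => member.lastSyncedAt < cutoff)
    .map((member) => member.userId);
}
```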
Pros:
- Offloads heavy work to async queue
- Ensures eventual consistency for stale records
Cons:
- Still scans all members per call, just defers work
- Generates queue events even if no data has changed
- Nearly as costly as Option 1 for large data sets
📦 Option 3: Client-Assisted Amendment (Proposed)
Clients compare member.user data from chats/events with cached friend data and flag mismatches to the server via POST /amendUserData. The server queues validation tasks and updates only if needed.
Pros:
- Reduces backend load significantly
- Focuses updates only on likely-stale users
- Seamless integration with client-side experience
Cons:
- Relies on clients being up-to-date and honest
- Cannot detect staleness for non-friend members
- Requires deduplication and rate-limiting logic
🚀 Option 4: Eager Propagation on Profile Update
When a user updates their profile, the server automatically locates all events and chats they appear in and updates all corresponding member entries with the new data.
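To make the write cost concrete, the sketch below lists every member document a single profile change would touch. It assumes the server already has some index of the chats and events a user belongs to; how that lookup works is not described here, and Memberships and planEagerWrites are hypothetical names.

```typescript
// Assumed lookup result: the chats and events a user appears in.
interface Memberships {
  chatIds: string[];
  eventIds: string[];
}

// Returns one document path per write that eager propagation would perform.
function planEagerWrites(userId: string, memberships: Memberships): string[] {
  const chatPaths = memberships.chatIds.map(
    (chatId) => `/chats/${chatId}/members/${userId}`
  );
  const eventPaths = memberships.eventIds.map(
    (eventId) => `/events/${eventId}/members/${userId}`
  );
  return chatPaths.concat(eventPaths);
}
```

The length of the returned list is the number of writes per profile change, so it grows with every chat or event the user has ever joined, active or not.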
Pros:
- Ensures all references are always up-to-date
- No runtime comparisons or queues needed later
- Straightforward update pipeline on change
Cons:
- Extremely write-heavy, even for outdated or unused documents
- Cost grows linearly with number of chats/events per user
- May update cold data unnecessarily (e.g., a 2-year-old chat)
Option Comparison Table
Factor | Option 1 | Option 2 | Option 3 | Option 4 |
---|---|---|---|---|
Freshness Guarantee | ✅ Strong | ⚠️ Deferred | ⚠️ Targeted | ✅ Strong |
Backend Load | ❌ High | ❌ High | ✅ Low | ❌ High |
Client Dependency | None | None | ⚠️ Friend-based only | None |
Latency on Load | ❌ High | ✅ Low | ✅ Low | ✅ Low |
Update Scope | Full | Full (stale) | Partial (flagged) | Full |
Write Amplification | ✅ Only when needed | ❌ On any stale | ✅ Only flagged | ❌ On every change |
Design Guardrails
- Deduplication: Client flags are stored per userId + context (chat/event) to prevent spam or redundant flags.
- Rate Limiting: Limit how often the same member is queued for update.
- Server Authority: Even though the client flags staleness, only the server compares values and updates.
- TTL-Based Override (Optional): If a member doc is older than X days, it may be auto-queued on access even without a client mismatch.
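The deduplication and rate-limiting guardrails could be combined in a single gate keyed by context and user ID, as in this in-memory sketch. AmendmentGate and its minimum-interval policy are assumptions for illustration; a real deployment would likely keep this state in the Database rather than in process memory.

```typescript
class AmendmentGate {
  // Last time each (context, userId) pair was queued, in epoch milliseconds.
  private readonly lastQueuedAt = new Map<string, number>();
  private readonly minIntervalMs: number;

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  // Returns true if the flag should be queued, false if the same member in the
  // same context was already queued within the minimum interval.
  shouldQueue(userId: string, context: string, nowMs: number): boolean {
    const key = `${context}/${userId}`;
    const last = this.lastQueuedAt.get(key);
    if (last !== undefined && nowMs - last < this.minIntervalMs) {
      return false;
    }
    this.lastQueuedAt.set(key, nowMs);
    return true;
  }
}
```

Rejected flags do not reset the timer, so a client that keeps re-sending the same flag cannot postpone the next accepted update indefinitely.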
Future Extensions
- Add ETag or version hashes to reduce need for full field comparisons.
- Use batch sync jobs for high-activity chats/events to reduce network chatter.
- Tie amendment logic to user profile changes (i.e., proactively queue some updates for close contacts).
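As one possible shape for the version-hash idea, the sketch below derives a short fingerprint from the profile fields so that two copies can be compared with a single string check. It uses a simple FNV-1a-style hash over UTF-16 code units, chosen here only for brevity; it is non-cryptographic, and profileVersion is a hypothetical name.

```typescript
// Computes an 8-character hex fingerprint of the compared profile fields.
function profileVersion(displayName: string, photoURL: string): string {
  // JSON array encoding keeps field boundaries unambiguous.
  const input = JSON.stringify([displayName, photoURL]);
  let hash = 0x811c9dc5; // FNV offset basis (32-bit)
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep unsigned 32-bit
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}
```

Storing this value next to each member copy would let the client or the worker compare one field instead of every profile field.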
Conclusion & Recommendation
- Option 1 and Option 2 keep data fresh (immediately for Option 1, eventually for Option 2) but do not effectively reduce overall server load; they primarily shift the burden.
- Option 4 ensures immediate consistency but is not sustainable at scale due to high write amplification.
- Option 3 is preferred for CaughtUp: it leverages client-side knowledge, minimizes unnecessary backend reads, and fits well with the app’s trust-driven architecture.
Recommendation:
Adopt Option 3 as the primary strategy. For high-visibility data (such as close friends, hosts, or participants in active chats/events), selectively apply Option 4 to ensure immediate consistency where it matters most.
This approach strikes a solid balance: minimal server overhead, user-informed freshness cues, and protected consistency via backend enforcement.
Journal
- 2025-06-28 Created file