feat(copilot): stabilize concurrency and enhance terminal context management

- AI Concurrency:
  - Implemented a dedicated background event loop (ConnpyAILoop) in a separate thread for AI tasks to ensure thread safety and event loop affinity.
  - Added 'run_ai_async' utility to funnel all LiteLLM calls through the dedicated loop.
  - Implemented global 'cleanup()' for safe closure of sync/async LiteLLM sessions.

- gRPC & Remote Sessions:
  - Enhanced 'NodeServicer' to identify command blocks within the terminal buffer using prompt regex/byte tracking.
  - Added support for selective context retrieval via 'context_start_pos' in the gRPC Interact stream.
  - Synchronized remote Copilot behavior by enriching questions with session history (last 5 queries) in 'NodeStub'.
  - Optimized token usage by cleaning 'node_info' metadata before AI transmission.

- Terminal Context & Core:
  - Modified 'node.connect' to always initialize 'mylog' (BytesIO) buffer regardless of disk logging configuration, ensuring Copilot context availability.
  - Integrated 'ai.cleanup()' in CLI (connapp) and Server (api) exit points for graceful shutdowns.
  - Suppressed LiteLLM internal streaming coroutine warnings during task cancellation.
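
Illustrative sketch (not the code from this commit): the dedicated-loop pattern described under "AI Concurrency" can be approximated as below. Only the name 'run_ai_async' comes from the message above. The 'BackgroundLoop' class, the 'timeout' parameter, and 'fake_completion' are hypothetical stand-ins, and the real 'ConnpyAILoop' may be structured differently.

```python
import asyncio
import threading


class BackgroundLoop:
    """Hypothetical stand-in for the ConnpyAILoop named in the commit message."""

    def __init__(self):
        # One event loop, owned by one daemon thread, for the process lifetime.
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._run, name="ai-loop", daemon=True)
        self._thread.start()

    def _run(self):
        asyncio.set_event_loop(self._loop)
        self._loop.run_forever()

    def run(self, coro, timeout=None):
        """Submit a coroutine from any thread and block until it finishes."""
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout)

    def close(self):
        # Stop the loop from inside its own thread, then release resources.
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


_ai_loop = BackgroundLoop()


def run_ai_async(coro, timeout=None):
    """Assumed signature: funnel every async AI call through the single loop."""
    return _ai_loop.run(coro, timeout)


async def fake_completion(prompt):
    # Placeholder for an async LLM call; also reports which thread ran it.
    await asyncio.sleep(0)
    return f"echo: {prompt}", threading.current_thread().name
```

Because every coroutine runs on the same loop and thread, async clients that bind to the loop they were created on are never reused from a different loop.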
2026-05-11 12:30:43 -03:00
parent 1103393be6
commit dba7e24dda
7 changed files with 170 additions and 15 deletions
+22 -2
@@ -345,7 +345,17 @@ class NodeStub:
 continue
 active_buffer = get_active_buffer()
-request_queue.put(connpy_pb2.InteractRequest(copilot_question=question, copilot_context_buffer=active_buffer))
+# Enrich question with history (same as local CLI)
+past_questions = self.copilot_history.get_strings()
+if len(past_questions) > 1:
+    # Limit history to last 5 questions to save tokens, excluding current
+    recent_history = past_questions[-6:-1]
+    history_text = "\n".join(f"- {q}" for q in recent_history)
+    enriched_question = f"Previous questions in this session:\n{history_text}\n\nCurrent Question:\n{question}"
+else:
+    enriched_question = question
+request_queue.put(connpy_pb2.InteractRequest(copilot_question=enriched_question, copilot_context_buffer=active_buffer))
 from rich.live import Live
 live_text = "Thinking..."
@@ -800,7 +810,17 @@ class NodeStub:
 continue
 active_buffer = get_active_buffer()
-request_queue.put(connpy_pb2.InteractRequest(copilot_question=question, copilot_context_buffer=active_buffer))
+# Enrich question with history (same as local CLI)
+past_questions = self.copilot_history.get_strings()
+if len(past_questions) > 1:
+    # Limit history to last 5 questions to save tokens, excluding current
+    recent_history = past_questions[-6:-1]
+    history_text = "\n".join(f"- {q}" for q in recent_history)
+    enriched_question = f"Previous questions in this session:\n{history_text}\n\nCurrent Question:\n{question}"
+else:
+    enriched_question = question
+request_queue.put(connpy_pb2.InteractRequest(copilot_question=enriched_question, copilot_context_buffer=active_buffer))
 from rich.live import Live
 live_text = "Thinking..."
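
Illustrative sketch (not part of this commit): both hunks above apply the same enrichment logic, which could be expressed as a standalone helper. The function name 'enrich_question' and the 'max_history' parameter are hypothetical. The sketch also assumes, as the '[-6:-1]' slice and the "excluding current" comment suggest, that the history list already ends with the current question.

```python
def enrich_question(question, past_questions, max_history=5):
    """Prefix the current question with up to max_history earlier ones.

    Assumes past_questions already ends with the current question,
    so the last element is skipped when building the history text.
    """
    if len(past_questions) > 1:
        # With max_history=5 this is the same slice as past_questions[-6:-1].
        recent_history = past_questions[-(max_history + 1):-1]
        history_text = "\n".join(f"- {q}" for q in recent_history)
        return (
            "Previous questions in this session:\n"
            f"{history_text}\n\nCurrent Question:\n{question}"
        )
    # First question of the session: nothing to prepend.
    return question
```

With a helper of this shape, each call site would reduce to building 'enriched_question' in one line before the 'request_queue.put(...)' call.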