feat(copilot): stabilize concurrency and enhance terminal context management

- AI Concurrency:
  - Implemented a dedicated background event loop (ConnpyAILoop) in a separate thread for AI tasks to ensure thread safety and event loop affinity.
  - Added 'run_ai_async' utility to funnel all LiteLLM calls through the dedicated loop.
  - Implemented global 'cleanup()' for safe closure of sync/async LiteLLM sessions.
- gRPC & Remote Sessions:
  - Enhanced 'NodeServicer' to identify command blocks within the terminal buffer using prompt regex/byte tracking.
  - Added support for selective context retrieval via 'context_start_pos' in the gRPC Interact stream.
  - Synchronized remote Copilot behavior by enriching questions with session history (last 5 queries) in 'NodeStub'.
  - Optimized token usage by cleaning 'node_info' metadata before AI transmission.
- Terminal Context & Core:
  - Modified 'node.connect' to always initialize 'mylog' (BytesIO) buffer regardless of disk logging configuration, ensuring Copilot context availability.
  - Integrated 'ai.cleanup()' in CLI (connapp) and Server (api) exit points for graceful shutdowns.
  - Suppressed LiteLLM internal streaming coroutine warnings during task cancellation.
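
The dedicated-loop pattern described under AI Concurrency can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the class name `BackgroundLoop` and the internals of `run_ai_async` and `cleanup` shown here are assumptions, and the real `ConnpyAILoop` may differ. The idea is that a single event loop lives on one thread, and synchronous callers hand coroutines to it with `asyncio.run_coroutine_threadsafe`, so async clients always see the same loop.

```python
import asyncio
import threading
from concurrent.futures import Future
from typing import Any, Coroutine, Optional


class BackgroundLoop:
    """Hypothetical stand-in for ConnpyAILoop: one event loop on one daemon thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._run, name="ai-loop", daemon=True)
        self._thread.start()

    def _run(self) -> None:
        # Bind the loop to this thread and serve submitted coroutines until stopped.
        asyncio.set_event_loop(self._loop)
        self._loop.run_forever()

    def submit(self, coro: Coroutine[Any, Any, Any]) -> Future:
        # Thread-safe hand-off; returns a concurrent.futures.Future.
        return asyncio.run_coroutine_threadsafe(coro, self._loop)

    def close(self) -> None:
        # Ask the loop to stop from its own thread, then release its resources.
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


_loop: Optional[BackgroundLoop] = None
_lock = threading.Lock()


def run_ai_async(coro: Coroutine[Any, Any, Any], timeout: Optional[float] = None) -> Any:
    """Run a coroutine on the shared loop and block the caller until it finishes."""
    global _loop
    with _lock:
        if _loop is None:
            _loop = BackgroundLoop()  # created lazily on first use
        loop = _loop
    return loop.submit(coro).result(timeout)


def cleanup() -> None:
    """Stop the shared loop if it was started; safe to call more than once."""
    global _loop
    with _lock:
        loop, _loop = _loop, None
    if loop is not None:
        loop.close()
```

A caller on any thread would use `run_ai_async(some_coroutine())` and call `cleanup()` at exit. This sketch does not cancel in-flight tasks before stopping the loop; a fuller version would cancel and await them first.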
2026-05-11 12:30:43 -03:00
parent 1103393be6
commit dba7e24dda
7 changed files with 170 additions and 15 deletions
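
The history enrichment this commit adds to 'NodeStub' (last 5 queries, excluding the current one) can be isolated as a small pure function to make the slicing easier to check. The function name `enrich_question` and the `limit` parameter are hypothetical. The sketch also assumes, as the diff's comment implies, that the history list already ends with the current question.

```python
from typing import List


def enrich_question(question: str, past_questions: List[str], limit: int = 5) -> str:
    """Prefix the question with up to `limit` earlier questions.

    Assumes past_questions already ends with the current question, so the
    slice stops one short of the end to leave it out of the history.
    """
    if len(past_questions) > 1:
        # With limit=5 this is past_questions[-6:-1], as in the diff.
        recent = past_questions[-(limit + 1):-1]
        history_text = "\n".join(f"- {q}" for q in recent)
        return (
            f"Previous questions in this session:\n{history_text}"
            f"\n\nCurrent Question:\n{question}"
        )
    # First question of the session: nothing to prepend.
    return question
```

With eight questions in the history, only the five before the current one are included, and a session with a single question is sent unchanged.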
@@ -345,7 +345,17 @@ class NodeStub:
                        continue

                    active_buffer = get_active_buffer()
-                    request_queue.put(connpy_pb2.InteractRequest(copilot_question=question, copilot_context_buffer=active_buffer))
+                    # Enrich question with history (same as local CLI)
+                    past_questions = self.copilot_history.get_strings()
+                    if len(past_questions) > 1:
+                        # Limit history to last 5 questions to save tokens, excluding current
+                        recent_history = past_questions[-6:-1]
+                        history_text = "\n".join(f"- {q}" for q in recent_history)
+                        enriched_question = f"Previous questions in this session:\n{history_text}\n\nCurrent Question:\n{question}"
+                    else:
+                        enriched_question = question
+                        
+                    request_queue.put(connpy_pb2.InteractRequest(copilot_question=enriched_question, copilot_context_buffer=active_buffer))
                    
                    from rich.live import Live
                    live_text = "Thinking..."
@@ -800,7 +810,17 @@ class NodeStub:
                        continue

                    active_buffer = get_active_buffer()
-                    request_queue.put(connpy_pb2.InteractRequest(copilot_question=question, copilot_context_buffer=active_buffer))
+                    # Enrich question with history (same as local CLI)
+                    past_questions = self.copilot_history.get_strings()
+                    if len(past_questions) > 1:
+                        # Limit history to last 5 questions to save tokens, excluding current
+                        recent_history = past_questions[-6:-1]
+                        history_text = "\n".join(f"- {q}" for q in recent_history)
+                        enriched_question = f"Previous questions in this session:\n{history_text}\n\nCurrent Question:\n{question}"
+                    else:
+                        enriched_question = question
+                        
+                    request_queue.put(connpy_pb2.InteractRequest(copilot_question=enriched_question, copilot_context_buffer=active_buffer))
                    
                    from rich.live import Live
                    live_text = "Thinking..."