AI-Powered CPA Compliance

POC Implementation Plan (6-8 Weeks)

Objective

Develop a focused Proof of Concept (POC) that automates the verification of client services against engagement agreements. The POC uses SFTP for file retrieval, locally hosted LLMs in a single-step processing workflow, and a chat interface backed by a vector database for interactive querying.

POC Summary

Data Extraction

  • Bruce runs SQL queries and exports the relevant data as CSV files.
  • Engagement agreements and CSV files are collected into a shared folder.

AI Processing

  • AI processes the engagement agreements and extracts key service commitments.
  • AI cross-checks the extracted commitments against the CSV records exported from SQL.
  • AI classifies each finding as Completed, Missing, or Unconfirmed.

Report Generation

  • AI generates a structured checklist showing which commitments are met and which elements are missing.
  • The report is formatted as a structured table.

Week 1-2: SFTP Setup & Data Ingestion

  • Develop SFTP connector for secure file retrieval
  • Implement CSV parser for structured data extraction
  • Set up PDF parser for engagement agreement processing
  • Create data validation and error handling mechanisms
  • Implement storage system for downloaded files and processed data
  • Test and refine data ingestion process
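
The CSV parsing and validation steps above could look roughly like the sketch below. It is only an illustration: the column names client_id and Payroll are assumptions, since the plan does not specify the export schema, and the function name parse_client_csv is hypothetical. Bad rows are collected rather than aborting the run, so one malformed record does not block the rest of the file.

```python
import csv
import io

# Assumed column names; the real export schema is not defined in this plan.
REQUIRED_COLUMNS = ("client_id", "Payroll")


def parse_client_csv(text):
    """Parse CSV text and return (valid_rows, errors).

    Raises ValueError if a required column is absent from the header.
    Rows with an empty client_id are reported in errors and skipped.
    """
    reader = csv.DictReader(io.StringIO(text))
    fieldnames = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in fieldnames]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    rows, errors = [], []
    for row in reader:
        # Keep only declared columns and normalise None/whitespace to "".
        clean = {k: (row.get(k) or "").strip() for k in fieldnames}
        if not clean["client_id"]:
            errors.append((reader.line_num, "empty client_id"))
            continue
        rows.append(clean)
    return rows, errors
```

Returning the errors alongside the valid rows keeps the validation results available for logging or for a later data-quality summary.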

Week 3-4: LLM Testing & Workflow Development

  • Set up Ollama for local LLM deployment
  • Prepare test suite with sample agreements and CSV data
  • Test multiple LLMs (e.g., Llama 3.1, Llama 3.2, Mistral Small, Phi-4)
  • Evaluate LLM performance on service extraction and matching tasks
  • Design and implement single-step processing workflow
  • Optimize workflow based on testing results
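
One possible way to compare models on the service extraction task is to score each model's extracted service list against a hand-labelled list for the same agreement. The sketch below assumes the model outputs have already been collected (for example from Ollama) into plain Python lists; the function names and the choice of F1 as the ranking metric are assumptions, not part of the plan.

```python
def normalize(name):
    """Lower-case and collapse whitespace so 'Tax  Preparation' matches 'tax preparation'."""
    return " ".join(name.lower().split())


def score_extraction(predicted, expected):
    """Return precision, recall and F1 for one agreement's extracted services."""
    pred = {normalize(s) for s in predicted}
    gold = {normalize(s) for s in expected}
    true_positives = len(pred & gold)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def rank_models(outputs_by_model, expected_by_doc):
    """Rank models by mean F1 over all labelled agreements, best first.

    outputs_by_model: {model_name: {doc_id: [service, ...]}}
    expected_by_doc:  {doc_id: [service, ...]}
    """
    ranking = []
    for model, outputs in outputs_by_model.items():
        scores = [
            score_extraction(outputs.get(doc_id, []), gold)["f1"]
            for doc_id, gold in expected_by_doc.items()
        ]
        mean_f1 = sum(scores) / len(scores) if scores else 0.0
        ranking.append((model, mean_f1))
    return sorted(ranking, key=lambda item: item[1], reverse=True)
```

Exact string matching after normalization is a deliberately simple baseline; if models phrase the same service differently, a synonym map or fuzzy match would be needed.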

Week 5: AI Processing & Classification

  • Implement service commitment extraction from agreements using selected LLM and workflow
  • Develop mechanism for cross-checking extracted data with CSV records
  • Create classification system for findings (Completed, Missing, Unconfirmed)
  • Implement an example classification rule: check the Payroll field in the CSV and classify the Payroll service as Completed if the field is populated or Missing if it is blank
  • Test and refine AI processing and classification system
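
The Payroll rule above can be sketched as follows. The Completed and Missing cases follow the rule as stated; the plan does not define when a finding is Unconfirmed, so this sketch assumes it means the CSV has no column for the service at all. The names Status, classify_service, and classify_agreement are hypothetical.

```python
from enum import Enum


class Status(Enum):
    COMPLETED = "Completed"
    MISSING = "Missing"
    UNCONFIRMED = "Unconfirmed"


def classify_service(service, csv_row):
    """Classify one agreed service against a client's CSV row."""
    if service not in csv_row:
        # Assumption: no matching column means the service cannot be verified.
        return Status.UNCONFIRMED
    value = (csv_row[service] or "").strip()
    # Rule from the plan: populated field is Completed, blank field is Missing.
    return Status.COMPLETED if value else Status.MISSING


def classify_agreement(services, csv_row):
    """Classify every service extracted from one engagement agreement."""
    return {service: classify_service(service, csv_row) for service in services}
```

For example, a row with a populated Payroll field and a blank Bookkeeping field would yield Completed for Payroll and Missing for Bookkeeping, while a service with no column in the row would be Unconfirmed under the assumption above.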

Week 6: Report Generation & Vector DB Setup

  • Implement structured checklist generation for compliance reporting
  • Design and implement structured table format for the report
  • Set up PostgreSQL database with the pgvector extension for storing processed data
  • Develop data insertion pipeline for storing processed information in the vector database
  • Test report generation and vector database functionality
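
As a rough illustration of the structured table format, the sketch below renders a client's findings as an aligned plain-text table. The two-column layout (Service, Status) and the function name render_checklist are assumptions; the plan does not fix the report columns.

```python
def render_checklist(client_id, findings):
    """Render findings as an aligned text table.

    findings: list of (service, status) string pairs.
    """
    headers = ("Service", "Status")
    rows = [headers] + [(service, status) for service, status in findings]
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(row[i]) for row in rows) for i in range(len(headers))]

    def fmt(row):
        cells = (cell.ljust(width) for cell, width in zip(row, widths))
        return " | ".join(cells).rstrip()

    lines = [f"Client: {client_id}", fmt(headers)]
    lines.append("-+-".join("-" * width for width in widths))
    lines.extend(fmt(row) for row in rows[1:])
    return "\n".join(lines)
```

The same list of (service, status) pairs could also be written to CSV or inserted into the database, so the table renderer stays independent of storage.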

Week 7: Chat Interface Development

  • Implement chat interface for interacting with the vector database
  • Create query processing system for translating user questions into vector searches
  • Develop response generation mechanism using retrieved vector data
  • Implement context-aware follow-up question handling
  • Test and refine chat interface functionality
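
The retrieval and follow-up handling steps above can be illustrated with a small in-memory stand-in for the vector search. In the actual system the ranking would presumably be done inside PostgreSQL by pgvector, and the vectors would come from an embedding model; neither is shown here. The function names and the prompt layout are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, records, k=3):
    """Return the k most similar (score, text) pairs.

    records: list of (text, vector) pairs, standing in for rows in the vector table.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question, retrieved, history=()):
    """Assemble an LLM prompt from retrieved text and prior (question, answer) turns."""
    parts = []
    if history:
        past = "\n".join(f"Q: {q}\nA: {a}" for q, a in history)
        parts.append("Previous conversation:\n" + past)
    context = "\n".join(f"- {text}" for _, text in retrieved)
    parts.append("Context:\n" + context)
    parts.append("Question: " + question)
    return "\n\n".join(parts)
```

Passing earlier turns through the history argument is one simple way to make follow-up questions context-aware; longer conversations would need truncation or summarization to stay within the model's context window.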

Week 8: System Integration & Final Testing

  • Integrate all components of the system
  • Perform end-to-end testing of the entire system
  • Optimize performance and fix any identified issues
  • Conduct user acceptance testing
  • Prepare final documentation and user guide
  • Deliver POC system and documentation