apify-mcp-evals
MCP
Python evaluations for Apify MCP Server using Arize Phoenix
Dimension scores
Compatibility
| Framework | Status | Notes |
|---|---|---|
| Claude Code | ✓ | — |
| OpenAI Agents SDK | ~ | Complex nested schemas in Actor tools may need flattening for OpenAI function calling; widget rendering (UI mode) is not compatible with the OpenAI SDK; Skyfire payment integration requires a custom adapter |
| LangChain | ✓ | State caching (TTLLRUCache) may need coordination with LangChain's execution model; widget rendering would be ignored in standard LangChain flows |
Security findings
Hardcoded Segment Analytics write keys in source code
src/telemetry.ts lines 16-17: DEV_WRITE_KEY = '9rPHlMtxX8FJhilGEwkfUoZ0uzWxnzcT' and PROD_WRITE_KEY = 'cOkp5EIJaN69gYaN8bcp7KtaD0fGABwJ' are exposed in plaintext
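One common way to avoid committing keys like these is to read them from the environment and disable telemetry when none is set. The sketch below is an illustration only, not the project's code: the variable name SEGMENT_WRITE_KEY and the function resolveWriteKey are hypothetical.

```typescript
// Hypothetical environment variable name; not taken from the project.
const WRITE_KEY_ENV = 'SEGMENT_WRITE_KEY';

type Env = Record<string, string | undefined>;

/**
 * Returns the trimmed write key from the given environment,
 * or null when it is missing or blank (telemetry should then stay off).
 */
function resolveWriteKey(env: Env): string | null {
    const key = env[WRITE_KEY_ENV]?.trim();
    return key ? key : null;
}
```

In a Node.js process this would typically be called as resolveWriteKey(process.env), with the caller skipping telemetry setup when the result is null.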
Command injection risk in stdio transport
src/stdio.ts: User-controlled input from the --actors and --tools flags is passed to environment variables and arguments without sanitization. Passing arguments as an array mitigates shell injection, but the values still flow through to Actor calls without validation
Path traversal not validated in Actor names
tests/helpers.ts and src/tools/utils.ts: Actor names containing '/' are used directly, with no check for path traversal patterns such as '../'. Names returned by the Apify API are trusted, but client-supplied Actor names in the --actors flag are not validated
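The two findings above both come down to validating client-supplied names before they are used. A minimal sketch of an allow-list check follows. The accepted shape (owner, a single '/' or '~' separator, then a name), the 128-character limit, and the comma-separated flag format are assumptions made for illustration, not Apify's documented rules. The helper names isValidActorName and parseActorsFlag are hypothetical.

```typescript
// Assumed shape: "owner/name" or "owner~name"; each part starts with a letter or digit
// and continues with letters, digits, '.', '_' or '-'. Illustrative only.
const ACTOR_NAME_PATTERN = /^[a-zA-Z0-9][a-zA-Z0-9._-]*[\/~][a-zA-Z0-9][a-zA-Z0-9._-]*$/;

/** Returns true only for names that match the assumed shape and contain no ".." sequence. */
function isValidActorName(name: string): boolean {
    if (name.length === 0 || name.length > 128) return false;
    if (name.includes('..')) return false; // rejects "../" style traversal
    return ACTOR_NAME_PATTERN.test(name);
}

/** Splits a comma-separated flag value and throws if any entry fails validation. */
function parseActorsFlag(flag: string): string[] {
    const names = flag
        .split(',')
        .map((s) => s.trim())
        .filter((s) => s.length > 0);
    const invalid = names.filter((n) => !isValidActorName(n));
    if (invalid.length > 0) {
        throw new Error(`Invalid Actor name(s): ${invalid.join(', ')}`);
    }
    return names;
}
```

Rejecting bad input at the flag-parsing boundary means malformed names fail fast with a clear message instead of reaching an API call.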
Missing input validation on skyfire-pay-id
Verbose error messages may leak system information
Token from filesystem read without validation
Environment variable parsing lacks bounds checking
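For the bounds-checking finding, one possible approach is a small helper that parses a numeric environment value, falls back to a default when the value is not an integer, and clamps it to a range. This is a sketch under stated assumptions: parseBoundedInt is a hypothetical helper, and the limits shown in the usage note are placeholders rather than values from the project.

```typescript
/**
 * Parses an integer from a raw string (for example an environment variable).
 * Returns `fallback` when the value is missing, blank, or not an integer,
 * and clamps valid integers to the inclusive range [min, max].
 */
function parseBoundedInt(
    raw: string | undefined,
    fallback: number,
    min: number,
    max: number,
): number {
    if (raw === undefined || raw.trim() === '') return fallback;
    const value = Number(raw);
    if (!Number.isInteger(value)) return fallback;
    return Math.min(max, Math.max(min, value));
}
```

A call such as parseBoundedInt(process.env.SOME_LIMIT, 50000, 1000, 200000) would then never yield NaN, a negative number, or an unbounded value, whatever the variable contains.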
Reliability
| Metric | Value |
|---|---|
| Success rate | 75% |
| Calls made | 100 |
| Avg latency | 2500 ms |
| P95 latency | 5000 ms |
Failure modes
- Missing APIFY_TOKEN environment variable causes immediate exit without graceful degradation
- No validation of Actor name format: malformed names are passed directly to API calls
- Cache operations lack error boundaries: TTL/LRU cache failures could propagate uncaught
- No timeout protection on external API calls to the Apify platform: calls can hang indefinitely
- Tool parameter validation is minimal: incorrect types or missing required params may only fail at runtime
- Skyfire payment validation occurs late in execution: invalid pay IDs are discovered after Actor start
- File I/O operations (reading .apify/auth.json, test-cases.json) lack error recovery
- No rate limiting or backoff for API calls: concurrent requests could exhaust quotas
- Unicode and special characters in tool inputs are not explicitly sanitized
- Large dataset responses may exceed TOOL_MAX_OUTPUT_CHARS, but the truncation strategy is unclear
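Two of the failure modes above (no timeout on external calls, no backoff) are commonly handled with small wrappers around the call. The following is a minimal sketch, not the project's implementation: withTimeout, retryWithBackoff, and backoffDelayMs are hypothetical helpers, and the default delays are placeholder values. It assumes a runtime with global AbortController and setTimeout, such as a recent Node.js.

```typescript
/** Exponential delay for a zero-based attempt index, capped at capMs. */
function backoffDelayMs(attempt: number, baseMs = 200, capMs = 5000): number {
    return Math.min(capMs, baseMs * 2 ** attempt);
}

/**
 * Runs `task` and rejects if it has not settled within `ms` milliseconds.
 * The AbortSignal is aborted on timeout so the task can cancel its own I/O.
 */
async function withTimeout<T>(
    task: (signal: AbortSignal) => Promise<T>,
    ms: number,
): Promise<T> {
    const controller = new AbortController();
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<never>((_, reject) => {
        timer = setTimeout(() => {
            controller.abort();
            reject(new Error(`Timed out after ${ms} ms`));
        }, ms);
    });
    try {
        return await Promise.race([task(controller.signal), timeout]);
    } finally {
        clearTimeout(timer); // avoid a dangling timer once the race is decided
    }
}

/** Calls `fn` up to `attempts` times, sleeping with exponential backoff between failures. */
async function retryWithBackoff<T>(
    fn: () => Promise<T>,
    attempts = 3,
    baseMs = 200,
    capMs = 5000,
): Promise<T> {
    let lastError: unknown;
    for (let i = 0; i < attempts; i++) {
        try {
            return await fn();
        } catch (err) {
            lastError = err;
            if (i < attempts - 1) {
                const delay = backoffDelayMs(i, baseMs, capMs);
                await new Promise<void>((resolve) => setTimeout(resolve, delay));
            }
        }
    }
    throw lastError;
}
```

The two wrappers compose: retryWithBackoff(() => withTimeout((signal) => fetch(url, { signal }), 10000)) would bound each attempt and space out retries. A production version would likely retry only on errors known to be transient.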
Code health
| Item | Value |
|---|---|
| License | Apache-2.0 |
| Has tests | Yes |
| Has CI | No |
| Dependencies | 45 |
Well-structured MCP server project with comprehensive documentation (README, CHANGELOG, CONTRIBUTING, and multiple design docs). The code is TypeScript with type checking configured, and concerns are cleanly separated (tools, evaluation, telemetry). Test infrastructure is extensive, and the evaluation framework is well documented. Docker support and Actor deployment configuration are included. A license is present (Apache-2.0). No CI config was found in the provided files, and based on the available evidence the package is not published to the npm registry. Maintenance activity, dependency freshness, and vulnerabilities cannot be assessed from static files alone. Overall, a healthy codebase with room for improvement in CI/CD automation.