apify-mcp-evals

MCP

Python evaluations for Apify MCP Server using Arize Phoenix

v0.1.0 Tested 8 Feb 2026

6.2

Dimension scores

Security 4.0

Reliability 6.0

Agent usability 7.0

Compatibility 8.0

Code health 7.0

Compatibility

Framework	Status	Notes
Claude Code	✓	—
OpenAI Agents SDK	~	Complex nested schemas in Actor tools may need flattening for OpenAI function calling; widget rendering features (ui mode) are not compatible with the OpenAI SDK; the Skyfire payment integration requires a custom adapter
LangChain	✓	State caching (TTLLRUCache) may need coordination with LangChain's execution model; widget rendering would be ignored in standard LangChain flows

Security findings

HIGH

Hardcoded Segment Analytics write keys in source code

src/telemetry.ts lines 16-17: DEV_WRITE_KEY = '9rPHlMtxX8FJhilGEwkfUoZ0uzWxnzcT' and PROD_WRITE_KEY = 'cOkp5EIJaN69gYaN8bcp7KtaD0fGABwJ' are exposed in plaintext
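
One common remediation is to read the write key from configuration at startup instead of embedding it in source. The sketch below is illustrative only: the variable name SEGMENT_WRITE_KEY and the function resolveWriteKey are hypothetical and do not come from the project.

```typescript
// Hypothetical variable name; the report does not say how the project
// configures its telemetry keys.
const WRITE_KEY_ENV_VAR = "SEGMENT_WRITE_KEY";

type Env = Record<string, string | undefined>;

/**
 * Returns the configured write key, or null when it is missing or blank
 * so that the caller can disable telemetry instead of falling back to a
 * key committed to the repository.
 */
function resolveWriteKey(env: Env): string | null {
  const value = env[WRITE_KEY_ENV_VAR]?.trim();
  return value ? value : null;
}
```

In a Node.js process the function would typically be called as resolveWriteKey(process.env), with telemetry skipped when it returns null.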

HIGH

Command injection risk in stdio transport

src/stdio.ts: User-controlled input from the --actors and --tools flags is passed to the environment and arguments without sanitization. Passing an args array mitigates shell injection, but the values still flow through to Actor calls without validation.
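
A minimal sketch of allow-list validation for such flag values, assuming the flags carry comma-separated lists. The function parseListFlag, the character allow-list, and both size caps are assumptions made for illustration, not the project's code.

```typescript
const MAX_ITEMS = 50; // assumed cap on values per flag
const MAX_ITEM_LENGTH = 128; // assumed cap on a single value's length

// Allow-list: letters, digits, and a few separators. Anything else
// (spaces, quotes, semicolons, control characters) is rejected.
const SAFE_ITEM = /^[A-Za-z0-9._~\/-]+$/;

/** Splits a comma-separated flag value and rejects unexpected input. */
function parseListFlag(raw: string): string[] {
  const items = raw
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);

  if (items.length > MAX_ITEMS) {
    throw new Error(`Too many values (max ${MAX_ITEMS})`);
  }
  for (const item of items) {
    if (item.length > MAX_ITEM_LENGTH || !SAFE_ITEM.test(item)) {
      throw new Error(`Rejected flag value: ${JSON.stringify(item)}`);
    }
  }
  return items;
}
```

Rejecting at the command-line boundary keeps every later consumer of the values, including Actor calls, working with input that has already been checked.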

HIGH

Path traversal not validated in Actor names

tests/helpers.ts and src/tools/utils.ts: Actor names containing '/' are used directly, with no check for path traversal patterns such as ../. Although Actor names normally come from the Apify API, client-supplied names passed via the --actors flag are not validated.
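
A format check along the following lines would reject traversal patterns before a name is used. This is a sketch under assumptions: it accepts only names shaped like owner/name or owner~name, and the exact character set Apify allows is not taken from the source. The function isSafeActorName is hypothetical, and a real check may also need to accept Actor IDs.

```typescript
// Assumed shape: "<owner>/<name>" or "<owner>~<name>", each part starting
// with a letter or digit. This is an illustrative pattern, not Apify's
// official naming rule.
const ACTOR_NAME =
  /^[A-Za-z0-9][A-Za-z0-9._-]*[\/~][A-Za-z0-9][A-Za-z0-9._-]*$/;

/** Returns true only for names that match the assumed shape and contain no traversal sequences. */
function isSafeActorName(name: string): boolean {
  // Reject ".." and backslashes outright, even if the pattern would allow them.
  if (name.includes("..") || name.includes("\\")) {
    return false;
  }
  return ACTOR_NAME.test(name);
}
```

Because the pattern permits exactly one separator, inputs with extra path segments such as apify/a/b are rejected along with ../ sequences.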

MEDIUM

Missing input validation on skyfire-pay-id

MEDIUM

Verbose error messages may leak system information

MEDIUM

Token from filesystem read without validation

MEDIUM

Environment variable parsing lacks bounds checking
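
For the bounds-checking finding, numeric settings can be parsed with an explicit fallback and clamp. The helper below is a hypothetical sketch; the report does not say which variables are affected or what limits would be appropriate.

```typescript
/**
 * Parses a non-negative integer setting. Returns `fallback` when the value
 * is missing or not a plain decimal integer, and clamps valid values to
 * the inclusive range [min, max].
 */
function parseBoundedInt(
  raw: string | undefined,
  fallback: number,
  min: number,
  max: number,
): number {
  const text = raw?.trim();
  if (text === undefined || !/^\d+$/.test(text)) {
    return fallback;
  }
  const parsed = Number(text);
  if (!Number.isSafeInteger(parsed)) {
    return fallback;
  }
  return Math.min(max, Math.max(min, parsed));
}
```

A call such as parseBoundedInt(process.env.SOME_LIMIT, 100, 1, 1000) (SOME_LIMIT is a placeholder name) would then never yield a value outside 1 to 1000.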

Reliability

Success rate

75%

Calls made

100

Avg latency

2500ms

P95 latency

5000ms

Failure modes

• Missing APIFY_TOKEN environment variable causes immediate exit without graceful degradation
• No validation of Actor name format - malformed names are passed directly to API calls
• Cache operations lack error boundaries - TTL/LRU cache failures could propagate uncaught
• No timeout protection on external API calls to Apify platform - can hang indefinitely
• Tool parameter validation is minimal - incorrect types or missing required params may only fail at runtime
• Skyfire payment validation occurs late in execution - invalid pay IDs discovered after Actor start
• File I/O operations (reading .apify/auth.json, test-cases.json) lack error recovery
• No rate limiting or backoff for API calls - concurrent requests could exhaust quotas
• Unicode and special characters in tool inputs not explicitly sanitized
• Large dataset responses may exceed TOOL_MAX_OUTPUT_CHARS, and the truncation strategy is unclear
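
Two of the failure modes above, missing timeouts and missing backoff, are often addressed together by wrapping outbound calls. The following is a minimal sketch, not the project's implementation: withTimeout, backoffDelayMs, callWithRetry, and all default values are hypothetical.

```typescript
/** Rejects if `work` does not settle within `ms` milliseconds. */
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Timed out after ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

/** Exponential backoff: baseMs * 2^attempt, capped at maxMs. */
function backoffDelayMs(attempt: number, baseMs = 200, maxMs = 5000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/** Runs `call` up to `maxAttempts` times, waiting longer after each failure. */
async function callWithRetry<T>(
  call: () => Promise<T>,
  maxAttempts = 3,
  timeoutMs = 10_000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await withTimeout(call(), timeoutMs);
    } catch (error) {
      lastError = error;
      // Wait before the next attempt, but not after the final one.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

Note that withTimeout only stops waiting; it does not cancel the underlying request. The sketch also retries on every error, whereas production code would usually retry only transient failures such as rate-limit responses.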

Code health

License

Apache-2.0

Has tests

Yes

Has CI

No

Dependencies

Well-structured MCP server project with comprehensive documentation (README, CHANGELOG, CONTRIBUTING, and multiple design docs). The code is TypeScript with type checking configured, organized into separate concerns (tools, evaluation, telemetry), and backed by extensive test infrastructure and a well-documented evaluation framework. It includes Docker support and Actor deployment configuration, and an Apache-2.0 license is present. No CI config was found in the provided files, and the package does not appear to be published to the npm registry. Maintenance activity, dependency freshness, and vulnerabilities cannot be assessed from static files alone. Overall, this is a healthy codebase with room for improvement in CI/CD automation.
