
Preparing for Coexistence: Comparing AI Employee Peers on Human Performance Benchmarks

Author

Shikhar Mishra


We are living through one of the most consequential periods in workforce history. The future of enterprise workforces will feature a hybrid of human and AI talent modalities, working harmoniously to unlock unprecedented business value.

This multi-modal talent force will have common responsibilities and performance evaluation criteria, ensuring fairness and transparency.

The best way for humans to prepare for coexistence with our new AI peers, and to see that coexistence as fair, is to ensure that these peers are held to the same performance benchmarks we are. Then it is a fair fight for excellence.

[Image: AI employee peers on human performance benchmarks]


What Leading Indicators Drive My Coexistence Hypothesis?

I see large language models (LLMs) writing code that is more effective, from a testability, readability, and maintainability standpoint, than what I wrote three years out of college. In most of my daily coding tasks, I now play the role of a reviewer/architect, adding strategic value; this reminds me of the work I used to do for architects and principal engineers earlier in my career. The trend is not restricted to engineering: I see similar patterns emerging in non-software job functions as well.

What Gives Me the Authority to Benchmark My AI Peers?

Career Ladder Growth: I grew from a Software Engineering Intern to a Vice President of Engineering. My first task was to fix an inconsequential bug in C#. My current tasks impact millions of dollars in revenue. I’ve seen firsthand the unique performance objectives of each level of the ladder.

Sample Size: The first production code I shipped was in 2004, 20 years ago. These two decades have given me a sample size of millions of lines of code (LOC), thousands of human employee peers, and hundreds of business scenarios in both enterprises and startups.

Vantage Point: Engineering executives have a unique vantage point. They are exposed to the performance criteria of software engineers as well as non-technical stakeholders, from go-to-market (GTM) to operations (Ops). This perspective is crucial for understanding AI-peer coexistence in enterprise workforces.

What Will Be My Benchmarking Criteria?

I will keep it simple. Complexity hurts understandability. The goal is to develop understanding in pursuit of harmonious and force-multiplying coexistence.

The Rule of Threes:

For each organizational role (Engineering, GTM, Ops) that a human employee currently holds:

1. Pick three job responsibilities

2. Pick three levels of performance ratings

3. Pick three feedback sources (self, manager, and peers)

I will then have AI employees go through the 6-month performance review cycle for their respective roles.
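
To make the Rule of Threes concrete, here is a minimal Python sketch of how one review cycle could be recorded and scored for either a human or an AI employee. The class names, example responsibilities, and rating labels are illustrative assumptions, not the actual benchmark implementation.

```python
# A minimal sketch (not the actual benchmark code) of the "Rule of Threes":
# three job responsibilities, a three-level rating scale, and three feedback
# sources, aggregated over one 6-month review cycle.
from dataclasses import dataclass, field
from enum import Enum
from statistics import mean


class Rating(Enum):
    """Three levels of performance ratings (labels are assumptions)."""
    BELOW_EXPECTATIONS = 1
    MEETS_EXPECTATIONS = 2
    EXCEEDS_EXPECTATIONS = 3


FEEDBACK_SOURCES = ("self", "manager", "peers")  # the three feedback sources


@dataclass
class ResponsibilityReview:
    """Ratings for one job responsibility, one rating per feedback source."""
    responsibility: str
    ratings: dict[str, Rating] = field(default_factory=dict)

    def add_rating(self, source: str, rating: Rating) -> None:
        if source not in FEEDBACK_SOURCES:
            raise ValueError(f"Unknown feedback source: {source}")
        self.ratings[source] = rating


@dataclass
class ReviewCycle:
    """One 6-month review cycle for one employee (human or AI) in one role."""
    employee: str  # e.g. "ai-backend-eng-1" or "human-backend-eng-1"
    role: str      # e.g. "Engineering", "GTM", "Ops"
    reviews: list[ResponsibilityReview] = field(default_factory=list)

    def overall_score(self) -> float:
        """Average rating across all responsibilities and feedback sources."""
        values = [r.value for rev in self.reviews for r in rev.ratings.values()]
        return mean(values) if values else 0.0


# Usage: score a hypothetical AI engineer against the same rubric a human gets.
cycle = ReviewCycle(employee="ai-backend-eng-1", role="Engineering")
for resp in ("ship features", "review code", "handle on-call"):  # three responsibilities
    rev = ResponsibilityReview(responsibility=resp)
    for source in FEEDBACK_SOURCES:
        rev.add_rating(source, Rating.MEETS_EXPECTATIONS)
    cycle.reviews.append(rev)
print(cycle.overall_score())  # -> 2.0
```

The point of the structure is that nothing in it is specific to humans or to AI: the same rubric, rating scale, and feedback sources apply to both, which is what makes the comparison a fair fight.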

I will stop here. You can read more about the benchmarks by joining The Outlier Engineer.

BTW, if you have a specific function that you would like to be evaluated against its AI-employee peer, please reach out to me at shikhar@alfred.sh.