Meta's AI Benchmark Controversy: Are Maverick Model Comparisons Misleading?