Self-supervised learning (SSL) has emerged as a powerful paradigm for extracting biologically meaningful representations from single-cell data, yet systematic guidelines for single-cell SSL applications remain lacking. Here, we present a comprehensive benchmark, scSSL-Bench, evaluating twelve SSL methods across eight datasets and three critical downstream tasks: batch correction, cell-type annotation, and missing modality prediction. We furthermore systematically assess various data augmentation strategies. Our analysis reveals task-specific trade-offs: the specialized single-cell frameworks, scVI and CLAIRE, excel at uni-modal batch correction, while the general SSL methods, VICReg and SimCLR, demonstrate superior performance in cell-type annotation and multi-modal data integration. Random masking emerges as the most effective augmentation technique across all tasks. Notably, our results indicate the need for a specialized single-cell multi-modal data integration framework. scSSL-Bench provides a standardized evaluation platform and concrete recommendations for applying SSL to single-cell analysis, advancing the convergence of deep learning and single-cell genomics.
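To make the random-masking augmentation concrete, the following is a minimal sketch of how such an augmentation could generate two stochastic views of a cell's expression vector for contrastive SSL (e.g., SimCLR-style positive pairs). The function name, the 20% mask rate, and the synthetic count data are illustrative assumptions, not values taken from the benchmark.

```python
import numpy as np

def random_mask(x, mask_rate=0.2, rng=None):
    """Zero out a random subset of gene-expression values.

    Hypothetical sketch of a random-masking augmentation; the
    default mask_rate is illustrative, not from scSSL-Bench.
    """
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(x.shape) < mask_rate  # positions to drop
    return np.where(mask, 0.0, x)

# Synthetic expression counts for one cell (2000 genes).
cell = np.random.default_rng(0).poisson(2.0, size=2000).astype(float)

# Two independently masked views of the same cell form a
# positive pair for a contrastive objective.
view_a = random_mask(cell, mask_rate=0.2)
view_b = random_mask(cell, mask_rate=0.2)
```

Each view keeps roughly 80% of the original values unchanged and zeroes the rest, so an encoder trained to map both views to nearby embeddings must learn representations robust to dropout-like sparsity, a noise pattern common in single-cell data.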