Computer Science PhD Student at UCLA
Original Works
Publications
-
Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning.
Colorado Reed*, Shufan Li*, Ritwik Gupta*, Sarah Brockman, Christopher Funk, Brian Clipp, Kurt Keutzer, Salvatore Candido, Matt Uyttendaele, Trevor Darrell
Scale-aware representation learning using prior information about image scale.
-
Hierarchical Open-vocabulary Universal Image Segmentation.
Xudong Wang*, Shufan Li*, Konstantinos Kallidromitis*, Yusuke Kato, Kazuki Kozuka, Trevor Darrell
Segmenting arbitrary objects and object parts using text prompts.
-
xT: Nested Tokenization for Larger Context in Large Images.
Ritwik Gupta*, Shufan Li*, Tyler Zhu*, Jitendra Malik, Trevor Darrell, Karttikeya Mangalam
Long context visual perception on large images.
Preprints
-
Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data.
Shufan Li*, Harkanwar Singh, Aditya Grover
Modeling multi-dimensional data with linear complexity.
-
Aligning Diffusion Models by Optimizing Human Utility.
Shufan Li*, Konstantinos Kallidromitis, Akash Gokul, Yusuke Kato, Kazuki Kozuka
Aligning text-to-image models with human feedback.
-
InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following.
Shufan Li*, Harkanwar Singh, Aditya Grover
Image editing following multi-modal instructions.
-
Refine and Represent: Region-to-Object Representation Learning
Akash Gokul*, Konstantinos Kallidromitis*, Shufan Li*, Yusuke Kato, Kazuki Kozuka, Trevor Darrell, Colorado J Reed
Representation learning benefits from learning first from image regions and then from actual objects.
Technical Reports
-
Interpreting Audiograms with Multi-stage Neural Networks
Shufan Li*, Congxi Lu, Linkai Li, Jirong Duan, Xinping Fu, Haoshuai Zhou
Accelerating hearing aid fitting using computer vision.
-
Chart-RCNN: Efficient Line Chart Data Extraction from Camera Images
Shufan Li*, Congxi Lu, Linkai Li, Haoshuai Zhou
Line chart data extraction in the wild.