Computer Science PhD Student at UCLA
Original Works
Publications
-
Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning.
Colorado Reed*, Shufan Li*, Ritwik Gupta*, Sarah Brockman, Christopher Funk, Brian Clipp, Kurt Keutzer, Salvatore Candido, Matt Uyttendaele, Trevor Darrell
Scale-aware representation learning using prior information about image scale.
-
Hierarchical Open-vocabulary Universal Image Segmentation.
Xudong Wang*, Shufan Li*, Konstantinos Kallidromitis*, Yusuke Kato, Kazuki Kozuka, Trevor Darrell
Segmenting arbitrary objects and object parts using text prompts.
-
xT: Nested Tokenization for Larger Context in Large Images.
Ritwik Gupta*, Shufan Li*, Tyler Zhu*, Jitendra Malik, Trevor Darrell, Karttikeya Mangalam
Long context visual perception on large images.
-
Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data.
Shufan Li*, Harkanwar Singh, Aditya Grover
Modeling multi-dimensional data with linear complexity.
-
Aligning Diffusion Models by Optimizing Human Utility.
Shufan Li*, Konstantinos Kallidromitis, Akash Gokul, Yusuke Kato, Kazuki Kozuka
Aligning text-to-image models with human feedback.
Preprints
-
PopAlign: Population-Level Alignment for Fair Text-to-Image Generation
Shufan Li*, Harkanwar Singh, Aditya Grover
Aligning diffusion models for fairness.
-
SegLLM: Multi-round Reasoning Segmentation
XuDong Wang*, Shaolun Zhang*, Shufan Li*, Konstantinos Kallidromitis, Kehan Li, Yusuke Kato, Kazuki Kozuka, Trevor Darrell
Multi-round interactive segmentation using large language models.
-
InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following.
Shufan Li*, Harkanwar Singh, Aditya Grover
Image editing following multi-modal instructions.
-
Refine and Represent: Region-to-Object Representation Learning
Akash Gokul*, Konstantinos Kallidromitis*, Shufan Li*, Yusuke Kato, Kazuki Kozuka, Trevor Darrell, Colorado J Reed
Representation learning benefits from learning first from image regions and then from actual objects.
Technical Reports
-
Interpreting Audiograms with Multi-stage Neural Networks
Shufan Li*, Congxi Lu, Linkai Li, Jirong Duan, Xinping Fu, Haoshuai Zhou
Accelerating hearing aid fitting using computer vision.
-
Chart-RCNN: Efficient Line Chart Data Extraction from Camera Images
Shufan Li*, Congxi Lu, Linkai Li, Haoshuai Zhou
Line chart data extraction in the wild.