Zixuan Jiang

Andrew0425
https://anxmuy.github.io/
  • AnXMuy

AI & ML interests

Vision Language Model, Multimodal Language Model, Remote Sensing

Recent Activity

upvoted a paper 2 days ago
Towards Human-Like Interactive Speech Recognition With Agentic Correction and Semantic Evaluation
submitted a paper 2 days ago
Towards Human-Like Interactive Speech Recognition With Agentic Correction and Semantic Evaluation
authored a paper 2 days ago
Annotation-Free Open-Vocabulary Segmentation for Remote-Sensing Images

Organizations

earth-insights

upvoted 2 papers 2 days ago

Towards Human-Like Interactive Speech Recognition With Agentic Correction and Semantic Evaluation

Paper • 2605.29430 • Published 13 days ago • 1

MMAE: A Massive Multitask Audio Editing Benchmark

Paper • 2606.07229 • Published 5 days ago • 42
upvoted a paper 7 months ago

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6, 2025 • 242
upvoted a collection about 1 year ago

Describe Anything

Collection
Multimodal Large Language Models for Detailed Localized Image and Video Captioning • 7 items • Updated 3 days ago • 63