Papers
arxiv:2604.16932

Neighbor Embedding for High-Dimensional Sparse Poisson Data

Published on Apr 18
Authors:
,

Abstract

Poisson Stochastic Neighbor Embedding (p-SNE) is a nonlinear dimensionality reduction method designed for count data that preserves the discrete, sparse nature of low-rate Poisson-distributed measurements.

AI-generated summary

Across many scientific fields, measurements often represent the number of times an event occurs. For example, a document can be represented by word occurrence counts, neural activity by spike counts per time window, or online communication by daily email counts. These measurements yield high-dimensional count data that often approximate a Poisson distribution, frequently with low rates that produce substantial sparsity and complicate downstream analysis. A useful approach is to embed the data into a low-dimensional space that preserves meaningful structure, commonly termed dimensionality reduction. Yet existing dimensionality reduction methods, including both linear (e.g., PCA) and nonlinear approaches (e.g., t-SNE), often assume continuous Euclidean geometry, thereby misaligning with the discrete, sparse nature of low-rate count data. Here, we propose p-SNE (Poisson Stochastic Neighbor Embedding), a nonlinear neighbor embedding method designed around the Poisson structure of count data, using KL divergence between Poisson distributions to measure pairwise dissimilarity and Hellinger distance to optimize the embedding. We test p-SNE on synthetic Poisson data and demonstrate its ability to recover meaningful structure in real-world count datasets, including weekday patterns in email communication, research area clusters in OpenReview papers, and temporal drift and stimulus gradients in neural spike recordings.

Community

Sign up or log in to comment

Get this paper in your agent:

hf papers read 2604.16932
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2604.16932 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2604.16932 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2604.16932 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.