Sherrie Walton

sherriew

AI & ML interests

Voice Assistants (e.g. Siri and Alexa)

Organizations

None yet

sherriew's activity

upvoted 29 papers 2 months ago

AMEX: Android Multi-annotation Expo Dataset for Mobile GUI Agents

Paper • 2407.17490 • Published Jul 3 • 30

BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation

Paper • 2407.17952 • Published Jul 25 • 27

SHIC: Shape-Image Correspondences with no Keypoint Supervision

Paper • 2407.18907 • Published Jul 26 • 39

Wolf: Captioning Everything with a World Summarization Framework

Paper • 2407.18908 • Published Jul 26 • 30

Floating No More: Object-Ground Reconstruction from a Single Image

Paper • 2407.18914 • Published Jul 26 • 18

VSSD: Vision Mamba with Non-Causal State Space Duality

Paper • 2407.18559 • Published Jul 26 • 16

Lessons from Learning to Spin "Pens"

Paper • 2407.18902 • Published Jul 26 • 19

Knesset-DictaBERT: A Hebrew Language Model for Parliamentary Proceedings

Paper • 2407.20581 • Published Jul 30 • 23

Meltemi: The first open Large Language Model for Greek

Paper • 2407.20743 • Published Jul 30 • 67

Futga: Towards Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation

Paper • 2407.20445 • Published Jul 29 • 20

A Large Encoder-Decoder Family of Foundation Models For Chemical Language

Paper • 2407.20267 • Published Jul 24 • 31

Integrating Large Language Models into a Tri-Modal Architecture for Automated Depression Classification

Paper • 2407.19340 • Published Jul 27 • 56

MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains

Paper • 2407.18961 • Published Jul 18 • 38

WalkTheDog: Cross-Morphology Motion Alignment via Phase Manifolds

Paper • 2407.18946 • Published Jul 11 • 12

TAPTRv2: Attention-based Position Update Improves Tracking Any Point

Paper • 2407.16291 • Published Jul 23 • 10

CARFF: Conditional Auto-encoded Radiance Field for 3D Scene Forecasting

Paper • 2401.18075 • Published Jan 31 • 8

Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion

Paper • 2401.17583 • Published Jan 31 • 25

RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval

Paper • 2401.18059 • Published Jan 31 • 34

TextCraftor: Your Text Encoder Can be Image Quality Controller

Paper • 2403.18978 • Published Mar 27 • 13

Mesh2NeRF: Direct Mesh Supervision for Neural Radiance Field Representation and Generation

Paper • 2403.19319 • Published Mar 28 • 11

sDPO: Don't Use Your Data All at Once

Paper • 2403.19270 • Published Mar 28 • 38

Vript: A Video Is Worth Thousands of Words

Paper • 2406.06040 • Published Jun 10 • 22

RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots

Paper • 2406.02523 • Published Jun 4 • 9

Towards a Personal Health Large Language Model

Paper • 2406.06474 • Published Jun 10 • 17

Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle

Paper • 2407.13833 • Published Jul 18 • 11

Qalam: A Multimodal LLM for Arabic Optical Character and Handwriting Recognition

Paper • 2407.13559 • Published Jul 18 • 12

The Vision of Autonomic Computing: Can LLMs Make It a Reality?

Paper • 2407.14402 • Published Jul 19 • 13

Stable Audio Open

Paper • 2407.14358 • Published Jul 19 • 22

Internal Consistency and Self-Feedback in Large Language Models: A Survey

Paper • 2407.14507 • Published Jul 19 • 44