OpenACI

Open Agent Computer Interface

A general agent based on large language models that acts on computers.

Research Publications

LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models

S Agashe, Y Fan, A Reyna, XE Wang
arXiv preprint arXiv:2310.03903, 2024

Comclip: Training-free compositional image and text matching

K Jiang, X He, R Xu, XE Wang
NAACL 2024, 2024

Muffin or Chihuahua? Challenging Large Vision-Language Models with Multipanel VQA

Y Fan, J Gu, K Zhou, Q Yan, S Jiang, CC Kuo, X Guan, XE Wang
ACL 2024, 2024

Swapanything: Enabling arbitrary object swapping in personalized visual editing

J Gu, Y Wang, N Zhao, W Xiong, Q Liu, Z Zhang, H Zhang, J Zhang, ...
ECCV 2024, 2024

Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation

Y Zhou, R Zhang, K Zheng, N Zhao, J Gu, Z Wang, XE Wang, T Sun
arXiv preprint arXiv:2406.09305, 2024

VIA: A Spatiotemporal Video Adaptation Framework for Global and Local Video Editing

J Gu, Y Fang, I Skorokhodov, P Wonka, X Du, S Tulyakov, XE Wang
arXiv preprint arXiv:2406.12831, 2024

Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding

Y Fan, L Ding, CC Kuo, S Jiang, Y Zhao, X Guan, J Yang, Y Zhang, ...
arXiv preprint arXiv:2406.19263, 2024

Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA

Q Yan, X He, X Yue, XE Wang
arXiv preprint arXiv:2405.20421, 2024

Navigation as Attackers Wish? Towards Building Byzantine-Robust Embodied Agents under Federated Learning

Y Zhang, Z Di, K Zhou, C Xie, X Wang
NAACL 2024, 2024

Vicor: Bridging visual understanding and commonsense reasoning with large language models

K Zhou, K Lee, T Misu, XE Wang
ACL 2024, 2024

MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos

X He, W Feng, K Zheng, Y Lu, W Zhu, J Li, Y Fan, J Wang, L Li, Z Yang, ...
arXiv preprint arXiv:2406.08407, 2024

FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation

X He, J Zheng, JZ Fang, R Piramuthu, M Bansal, V Ordonez, ...
arXiv preprint arXiv:2405.04834, 2024

LayoutGPT: Compositional Visual Planning and Generation with Large Language Models

W Feng, W Zhu, T Fu, V Jampani, A Akula, X He, S Basu, XE Wang, ...
NeurIPS 2023, 2023

Parameter-efficient Model Adaptation for Vision Transformers

X He, C Li, P Zhang, J Yang, XE Wang
AAAI 2023, 2023

ESC: Exploration with Soft Commonsense Constraints for Zero-shot Object Navigation

K Zhou, K Zheng, C Pryor, Y Shen, H Jin, L Getoor, XE Wang
ICML 2023, 2023

Minigpt-5: Interleaved vision-and-language generation via generative vokens

K Zheng, X He, XE Wang
arXiv preprint arXiv:2310.02239, 2023

Multimodal procedural planning via dual text-image prompting

Y Lu, P Lu, Z Chen, W Zhu, XE Wang, WY Wang
arXiv preprint arXiv:2305.01795, 2023

LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation

Y Lu, X Yang, X Li, XE Wang, WY Wang
NeurIPS 2023, 2023

T2IAT: Measuring Valence and Stereotypical Biases in Text-to-Image Generation

J Wang, XG Liu, Z Di, Y Liu, XE Wang
ACL 2023, 2023

Photoswap: Personalized subject swapping in images

J Gu, Y Wang, N Zhao, TJ Fu, W Xiong, Q Liu, Z Zhang, H Zhang, J Zhang, ...
NeurIPS 2023, 2023

Aerial Vision-and-Dialog Navigation

Y Fan, W Chen, T Jiang, C Zhou, Y Zhang, XE Wang
Findings of ACL 2023, 2023

Imagine: An imagination-based automatic evaluation metric for natural language generation

W Zhu, XE Wang, A Yan, M Eckstein, WY Wang
EACL 2023, 2023

Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation

W Zhu, X Wang, Y Lu, TJ Fu, XE Wang, M Eckstein, WY Wang
EMNLP 2023, 2023

Multimodal Graph Transformer for Multimodal Question Answering

X He, XE Wang
EACL 2023, 2023

Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners

X He, W Feng, TJ Fu, V Jampani, A Akula, P Narayana, S Basu, WY Wang, ...
arXiv preprint arXiv:2305.10722, 2023

R2H: Building Multimodal Navigation Helpers that Respond to Help Requests

Y Fan, J Gu, K Zheng, XE Wang
EMNLP 2023, 2023

Parameter-Efficient Cross-lingual Transfer of Vision and Language Models via Translation-based Alignment

Z Zhang, J Wang, XE Wang
EMNLP 2023, 2023

Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis

W Feng, X He, TJ Fu, V Jampani, A Akula, P Narayana, S Basu, XE Wang, ...
ICLR 2023, 2022

Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions

J Gu, E Stefani, Q Wu, J Thomason, XE Wang
ACL 2022, 2022

Compositional temporal grounding with structured variational cross-graph correspondence learning

J Li, J Xie, L Qian, L Zhu, S Tang, F Wu, Y Yang, Y Zhuang, XE Wang
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022

VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation

K Zheng, X Chen, OC Jenkins, XE Wang
NeurIPS 2022, 2022

Language-driven artistic style transfer

TJ Fu, XE Wang, WY Wang
European Conference on Computer Vision, 717-734, 2022

Diagnosing Vision-and-Language Navigation: What Really Matters

W Zhu, Y Qi, P Narayana, K Sone, S Basu, XE Wang, Q Wu, M Eckstein, ...
NAACL 2022, 2022

Neuro-Symbolic Procedural Planning with Commonsense Prompting

Y Lu, W Feng, W Zhu, W Xu, XE Wang, M Eckstein, WY Wang
ICLR 2023, 2022

Understanding Instance-Level Impact of Fairness Constraints

J Wang, XE Wang, Y Liu
ICML 2022, 2022

Visualize Before You Write: Imagination-Guided Open-Ended Text Generation

W Zhu, A Yan, Y Lu, W Xu, XE Wang, M Eckstein, WY Wang
EACL 2023, 2022

Imagination-Augmented Natural Language Understanding

Y Lu, W Zhu, XE Wang, M Eckstein, WY Wang
NAACL 2022, 2022

Assessing Multilingual Fairness in Pre-trained Multimodal Representations

J Wang, Y Liu, XE Wang
Findings of ACL 2022, 2022

Jarvis: A neuro-symbolic commonsense reasoning framework for conversational embodied agents

K Zheng, K Zhou, J Gu, Y Fan, J Wang, Z Di, X He, XE Wang
arXiv preprint arXiv:2208.13266, 2022

CPL: Counterfactual Prompt Learning for Vision and Language Models

X He, D Yang, W Feng, TJ Fu, A Akula, V Jampani, P Narayana, S Basu, ...
EMNLP 2022, 2022

Language-based Video Editing via Multi-Modal Multi-Level Transformer

TJ Fu, XE Wang, ST Grafton, MP Eckstein, WY Wang
CVPR 2022, 2022

Associations between inflammatory marker profiles and neurocognitive functioning in people with schizophrenia and non-psychiatric comparison subjects

DH Adamowicz, PD Shilling, BW Palmer, TT Nguyen, E Wang, C Liu, X Tu, ...
Journal of psychiatric research 149, 106-113, 2022

FedVLN: Privacy-preserving Federated Vision-and-Language Navigation

K Zhou, XE Wang
ECCV 2022, 2022

Interpretable Research Replication Prediction via Variational Contextual Consistency Sentence Masking

T Luo, R Meng, XE Wang, Y Liu
Findings of ACL 2022, 2022

Anticipating the Unseen Discrepancy for Vision and Language Navigation

Y Lu, H Zhang, P Nie, W Feng, W Xu, XE Wang, WY Wang
arXiv preprint arXiv:2209.04725, 2022

Estimating Instance-dependent Label-noise Transition Matrix using a Deep Neural Network

J Wang, EX Wang, Y Liu
International Conference on Machine Learning, 2022

VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation

L Li, J Lei, Z Gan, L Yu, YC Chen, R Pillai, Y Cheng, L Zhou, XE Wang, ...
NeurIPS 2021, 2021

Are Gender-Neutral Queries Really Gender-Neutral? Mitigating Gender Bias in Image Search

J Wang, Y Liu, XE Wang
EMNLP 2021, 2021

Multimodal text style transfer for outdoor vision-and-language navigation

W Zhu, XE Wang, TJ Fu, A Yan, P Narayana, K Sone, S Basu, WY Wang
EACL 2021, 2021

L2C: Describing Visual Differences Needs Semantic Understanding of Individuals

A Yan, XE Wang, TJ Fu, WY Wang
EACL 2021, 2021

CUDA-GHR: Controllable Unsupervised Domain Adaptation for Gaze and Head Redirection

S Jindal, XE Wang
WACV 2023, 2021

Visual Question Rewriting for Increasing Response Rate

J Wei, X Li, Y Zhang, X Wang
SIGIR 2021, 2021

REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments

Y Qi, Q Wu, P Anderson, X Wang, WY Wang, C Shen, A Hengel
CVPR 2020, 2020

Environment-agnostic Multitask Learning for Natural Language Grounded Navigation

X Wang, V Jain, E Ie, WY Wang, Z Kozareva, S Ravi
ECCV 2020, 2020

Multimodal style-transfer network for applying style features from multi-resolution style exemplars to input images

G Oxholm, X Wang
US Patent 10,565,757, 2020

SSCR: Iterative Language-Based Image Editing via Self-Supervised Counterfactual Reasoning

TJ Fu, XE Wang, S Grafton, M Eckstein, WY Wang
EMNLP 2020, 2020

Relational graph learning for grounded video description generation

W Zhang, XE Wang, S Tang, H Shi, H Shi, J Xiao, Y Zhuang, WY Wang
Proceedings of the 28th ACM International Conference on Multimedia, 3807-3828, 2020

Vision-language navigation policy learning and adaptation

X Wang, Q Huang, A Celikyilmaz, J Gao, D Shen, YF Wang, WY Wang, ...
IEEE transactions on pattern analysis and machine intelligence 43 (12), 4205 …, 2020

Learning to Stop: A Simple yet Effective Approach to Urban Vision-Language Navigation

J Xiang, XE Wang, WY Wang
Findings of EMNLP 2020, 2020

Towards Understanding Sample Variance in Visually Grounded Language Generation: Evaluations and Observations

W Zhu, XE Wang, P Narayana, K Sone, S Basu, WY Wang
EMNLP 2020, 2020

Closing the loop between language and vision for embodied agents

X Wang
University of California, Santa Barbara, 2020

Proceedings of the First Workshop on Advances in Language and Vision Research

X Wang, J Thomason, R Hu, X Chen, P Anderson, Q Wu, A Celikyilmaz, ...
2020

VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research

X Wang, J Wu, J Chen, L Li, YF Wang, WY Wang
ICCV 2019, 2019

Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation

X Wang, Q Huang, A Celikyilmaz, J Gao, D Shen, YF Wang, WY Wang, ...
CVPR 2019, 2019

MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment

D Zhang, X Dai, X Wang, YF Wang, LS Davis
CVPR 2019, 2019

Counterfactual Vision-and-Language Navigation via Adversarial Path Sampler

TJ Fu, X Wang, M Peterson, S Grafton, M Eckstein, WY Wang
ECCV 2020, 2019

Self-Supervised Dialogue Learning

J Wu, X Wang, WY Wang
ACL 2019, 2019

Self-Supervised Learning for Contextualized Extractive Summarization

H Wang, X Wang, W Xiong, M Yu, X Guo, S Chang, WY Wang
ACL 2019, 2019

Towards Generating Long and Coherent Text with Multi-Level Latent Variable Models

D Shen, A Celikyilmaz, Y Zhang, L Chen, X Wang, J Gao, L Carin
ACL 2019, 2019

Extract and edit: An alternative to back-translation for unsupervised neural machine translation

J Wu, X Wang, WY Wang
NAACL 2019, 2019

Learning to Compose Topic-Aware Mixture of Experts for Zero-Shot Video Captioning

X Wang, J Wu, D Zhang, Y Su, WY Wang
AAAI 2019, 2019

Cross-Lingual Vision-Language Navigation

A Yan, XE Wang, J Feng, L Li, WY Wang
arXiv preprint arXiv:1910.11301, 2019

Natural Language Grounded Multitask Navigation

X Wang, V Jain, E Ie, WY Wang, Z Kozareva, S Ravi
NeurIPS-ViGIL 2019, 2019

Not All Actions Are Equal: Learning to Stop in Language-Grounded Urban Navigation

J Xiang, X Wang, WY Wang
NeurIPS-ViGIL 2019, 2019

Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation

X Wang, W Xiong, H Wang, WY Wang
ECCV 2018, 2018

No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling

X Wang, W Chen, YF Wang, WY Wang
ACL 2018, 2018

Video Captioning via Hierarchical Reinforcement Learning

X Wang, W Chen, J Wu, YF Wang, WY Wang
CVPR 2018, 2018

Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning

X Wang, YF Wang, WY Wang
NAACL HLT 2018, 2018

S3D: Single Shot multi-Span Detector via Fully 3D Convolutional Networks

D Zhang, X Dai, X Wang, YF Wang
BMVC 2018, 2018

XL-NBT: A Cross-lingual Neural Belief Tracking Framework

W Chen, J Chen, Y Su, X Wang, D Yu, X Yan, WY Wang
EMNLP 2018, 2018

Enhancing the Robustness of Prior Network in Out-of-Distribution Detection

W Chen, Y Shen, X Wang, W William
arXiv preprint arXiv:1811.07308, 2018

Deep reinforcement learning for visual object tracking in videos

D Zhang, H Maei, X Wang, YF Wang
arXiv preprint arXiv:1701.08936, 2017

Multimodal Transfer: A Hierarchical Deep Convolutional Neural Network for Fast Artistic Style Transfer

X Wang, G Oxholm, D Zhang, YF Wang
CVPR 2017, 2017

Effect of spine hardware on small spinal stereotactic radiosurgery dosimetry

X Wang, JN Yang, X Li, R Tailor, O Vassilliev, P Brown, L Rhines, ...
Physics in Medicine & Biology 58 (19), 6733, 2013

CIAPIN1 siRNA inhibits proliferation, migration and promotes apoptosis of VSMCs by regulating Bcl-2 and Bax

Z Yang, W Eric Wang, Q Zhang
Current neurovascular research 10 (1), 4-10, 2013

The use of scFv-displaying yeast in mammalian cell surface selections

XX Wang, EV Shusta
Journal of immunological methods 304 (1-2), 30-42, 2005

Ready to use your computer
in a "Simular" way?

Share and organize your memory, and personalize your tasks.

Take notes
Notifications
Give feedback
Play computer actions