The Statistics & Machine Learning (SMILE) Laboratory aims to design novel, explainable machine learning mechanisms from the perspective of data and to improve the robustness of real-world AI applications efficiently, i.e., Data -> Model. We also use explainable AI to interpret and structure large-scale, complex real-world data, helping users in their own fields with tasks such as scientific discovery and decision making, i.e., Model -> Data. This involves techniques including:
- Data-Centric AI: Re-examine the scaling laws of AI models and improve training efficiency from the perspective of data, through data mining & sense-making, data clustering, knowledge graphs, and data compression.
- De-Centralized AI: Investigate asynchronous training, and consider combining network modules and outputs (instead of synchronizing all individual parameters), as well as the collaboration of networks trained on different data/tasks.
- Representation Learning for Multi-Modal Life-Long AI Agent: Design universal visual representations that minimize the impact of the distribution gap between real-world and training scenarios, support life-long AI adaptation, and speed up AI product design. These representations can then be applied to scenarios including multi-modal AI, generative AI, sense-making, and embodied AI.
- AI for Science: Explore AI applications in other scientific fields such as biomedicine, materials science, and physics; build AI agents to assist with scientific research, for example through effective data indexing and experimental simulation.
- Data Privacy & Security: Protect user data privacy, build trustworthy AI models, and detect plagiarism in AI training to help protect creators' copyright.
(The website hosted on the CityU platform is being set up and will be updated soon.)