Test Run - Using Multimodal Vision AI In Test Automation
Explores using multimodal vision AI models like LLaVA for advanced UI/UX test automation, moving beyond traditional methods.
A developer experiments with Llamafile and LLaVA 1.5 to extract structured data from comedy show posters, testing the model's accuracy and JSON output capabilities.
Building an image search system with GPT-4 Vision and Azure AI that finds images by text query or by similarity to another picture.
A technical guide on using Meta AI's Segment Anything model to perform object segmentation on satellite imagery from Maxar.
A weekly tech learning digest covering Microsoft Fabric, AI topics, computer vision, Azure AI Document Intelligence, embeddings, and vector search.
Interview with Frank Liu on vector databases, embeddings, his career in ML/hardware, and work culture differences between China and the US.
Explains the Supercells algorithm for generating superpixels to improve segmentation of geospatial and satellite imagery.
Explores a future AI-assisted computer interface model inspired by sci-fi, where AI highlights data anomalies for human specialist review.
A guide to using Hugging Face Transformers library with examples for fine-tuning models like BERT and BART for NLP and computer vision tasks.
A review of the top 10 most influential machine learning papers from 2022, including ConvNeXt and MaxViT, with technical analysis.
A developer uses Python, OpenCV, and computer vision to automate collecting in-game currency in City Island 5, earning millions overnight.
A tutorial on fine-tuning Microsoft's LayoutLM model for document understanding and information extraction using the Hugging Face Transformers library.
A technical guide on using Hugging Face's SegFormer model with Amazon SageMaker for semantic image segmentation tasks.
A $10,000 charity bet on whether fully autonomous (Level 5) self-driving cars will be commercially available in major US cities by 2030.
A tutorial on creating an interactive digital frame with head-tracking perspective effects using Three.js and TensorFlow.js.
A comprehensive deep learning course covering fundamentals, neural networks, computer vision, and generative models using PyTorch.
A comprehensive deep learning course overview with PyTorch tutorials, covering fundamentals, neural networks, and advanced topics like CNNs and GANs.
Explores how images are discretized into pixels, the impact of sampling grids on deep learning models, and inconsistencies in image processing libraries.
A developer uses Python and OpenCV to build a program that identifies embroidery thread colors from images, applying computer vision techniques.