Introducing VLOGGER: a leap into the future of audio-driven human video synthesis!

Google Research has just unveiled VLOGGER, a revolutionary framework that brings single images to life with dynamic videos created from audio inputs.

Imagine generating lifelike videos of people talking, complete with head motion, gaze, lip movement, and even upper-body gestures, all from a single photo and a piece of audio!

No need for person-specific training, face detection, or cropping. Google Research’s approach generates the entire image, capturing a broad spectrum of human communication scenarios.

  • Unparalleled photorealism and temporal coherence.
  • Generation of full-body dynamics, including gestures.

The applications? Endless! From video editing to personalization, VLOGGER could transform content creation, online communication, education, and more. ✨

Here’s the research for a deeper dive. And here’s the paper.


Springer Nature recently announced a new AI-powered in-house writing assistant to support researchers. The tool has been trained on scholarly literature that covers 447+ disciplines, encompasses over 2,000 specialized subjects, and incorporates feedback from more than 1 million revisions on papers—including those featured in esteemed Nature publications.

Studies indicate scientists who are non-native English speakers spend 51% more time writing papers on average. This disparity creates an imbalance in the research field, hindering the progression of knowledge and affecting the contribution of top-tier research from various parts of the world.

This underscores the trend towards creating LLMs tailored for particular uses with specialized domain knowledge.

Vision-language models (VLMs) are AI models that combine vision and language modalities, processing both images and natural language. Researchers are now extending VLMs with an action layer: models that take in visual and textual information and generate sequences of decisions for real-world scenarios. This fusion of vision, language, and action is emerging as a potentially useful AI paradigm for a wide range of applications. Vision-Language-Action models (VLAs) are designed to perceive visual data, interpret it using linguistic context, and then generate a corresponding action or response. In essence, VLAs emulate human-like cognition, where sight, comprehension, and action intertwine.

At its core, a VLA marries computer vision with natural language processing. The vision component enables machines to “see,” or interpret, visual data. This is complemented by the language component, which processes the visual information in linguistic terms, enabling the machine to “understand” or describe what it sees. Finally, the action component facilitates a response, whether that be a decision, movement, or another specific output.
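As a rough illustration of the three components described above, a VLA's perceive → interpret → act loop can be sketched as follows. This is a minimal, hypothetical sketch with invented class and method names and stubbed-out internals, not any particular model's API:

```python
from dataclasses import dataclass

@dataclass
class Observation:
    image: bytes        # raw visual input (e.g., a camera frame)
    instruction: str    # natural-language context or command

class VisionLanguageActionModel:
    """Hypothetical sketch of a vision-language-action pipeline."""

    def perceive(self, image: bytes) -> dict:
        # Vision component: extract a structured scene description.
        # A real model would run an image encoder here; we return a stub.
        return {"objects": ["pedestrian", "crosswalk"], "scene": "urban street"}

    def interpret(self, scene: dict, instruction: str) -> str:
        # Language component: ground the scene in linguistic terms.
        return f"Instruction '{instruction}' in a scene with {', '.join(scene['objects'])}"

    def act(self, interpretation: str) -> str:
        # Action component: map the grounded interpretation to a decision.
        if "pedestrian" in interpretation:
            return "slow_down"
        return "proceed"

    def step(self, obs: Observation) -> str:
        scene = self.perceive(obs.image)
        meaning = self.interpret(scene, obs.instruction)
        return self.act(meaning)

model = VisionLanguageActionModel()
action = model.step(Observation(image=b"", instruction="drive to the corner"))
print(action)  # -> slow_down (the stubbed scene contains a pedestrian)
```

In a real system each stub would be a learned model, but the interface — visual perception feeding a linguistic interpretation that drives an action — is the pattern the paragraph describes.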

Wayve recently introduced LINGO-1, an open-loop driving commentator. Some key quotes from their announcement:

The use of natural language in training robots is still in its infancy, particularly in autonomous driving. Incorporating language along with vision and action may have an enormous impact as a new modality to enhance how we interpret, explain and train our foundation driving models. By foundation driving models, we mean models that can perform several driving tasks, including perception (perceiving the world around them), causal and counterfactual reasoning (making sense of what they see), and planning (determining the appropriate sequence of actions). We can use language to explain the causal factors in the driving scene, which may enable faster training and generalisation to new environments.

We can also use language to probe models with questions about the driving scene to more intuitively understand what it comprehends. This capability can provide insights that could help us improve our driving models’ reasoning and decision-making capabilities. Equally exciting, VLAMs open up the possibility of interacting with driving models through dialogue, where users can ask autonomous vehicles what they are doing and why. This could significantly impact the public’s perception of this technology, building confidence and trust in its capabilities.

In addition to having a foundation driving model with broad capabilities, it is also eminently desirable for it to efficiently learn new tasks and quickly adapt to new domains and scenarios where we have small training samples. Here is where natural language could add value in supporting faster learning. For instance, we can imagine a scenario where a corrective driving action is accompanied by a natural language description of incorrect and correct behaviour in this situation. This extra supervision can enhance few-shot adaptations of the foundation model. With these ideas in mind, our Science team is exploring using natural language to build foundation models for end-to-end autonomous driving.

These models enable us to ask questions so we can better understand what the model “sees” and how it reasons. Here’s an example:

https://www.youtube.com/watch?v=6X51pxPJpa4&list=PL5ksjZd5b6SK5X_u1Ix-flUjNS97_fk4r&index=9

Language can help interpret and explain AI model decisions, a potentially useful application for adding transparency and understanding to AI. It can also help train models, enabling them to adapt more quickly to changes in the real world.

The question, “Will AI take my job?” is becoming ubiquitous in the modern workplace. As AI continues to evolve, it’s natural for all of us to feel a mix of excitement and trepidation. Here are ten insights and actions we can take to gauge and navigate the AI revolution:

  1. AI Doesn’t Just Replace, It Augments:
    • Insight: AI doesn’t aim to replace human roles. More often, it augments them, making tasks easier and more efficient.
    • Action: Embrace AI tools in your current role. By integrating them into your daily tasks, you can enhance your productivity and showcase your adaptability.
  2. Soft Skills Matter More Than Ever:
    • Insight: While AI excels in processing and pattern recognition, it still struggles with empathy, creativity, and interpersonal communication.
    • Action: Invest in developing soft skills. Attend workshops, read books, or take online courses to hone your emotional intelligence, leadership, and creativity.
  3. Routine is AI’s Playground:
    • Insight: Jobs that involve repetitive, routine tasks are more susceptible to automation.
    • Action: Diversify your skill set. If your role involves routine tasks, seek opportunities to take on more strategic, varied responsibilities.
  4. AI Struggles with Ambiguity:
    • Insight: AI requires clear instructions and defined parameters. Ambiguous tasks that require nuanced judgment remain a human domain.
    • Action: Position yourself in roles that demand decision-making in ambiguous situations. This could involve strategy, planning, or crisis management.
  5. Continuous Learning is the New Normal:
    • Insight: The AI landscape is ever-evolving. What’s cutting-edge today might be obsolete tomorrow.
    • Action: Adopt a mindset of continuous learning. Regularly update your knowledge about the latest AI trends and their implications for your industry.
  6. Interdisciplinary Knowledge is a Shield:
    • Insight: AI finds it challenging to replicate interdisciplinary expertise, where knowledge from multiple domains is applied.
    • Action: Don’t pigeonhole yourself. Gain expertise in complementary fields to make your skill set unique and invaluable.
  7. AI is Only as Good as Its Data:
    • Insight: AI relies on vast amounts of data. Roles that involve data cleaning, interpretation, and ethical considerations are crucial.
    • Action: Understand the data that drives AI in your industry. Consider roles in data analytics, interpretation, or ethics.
  8. Ethical Considerations are Paramount:
    • Insight: As AI becomes more integrated into our lives, ethical dilemmas will arise. Human judgment will be essential in navigating these challenges.
    • Action: Engage in discussions about the ethical implications of AI in your field. This will position you as a thoughtful leader in the AI conversation.
  9. AI Cannot Replicate Human Networks:
    • Insight: While AI can analyze social networks, it cannot replicate the depth of human relationships and the nuances of networking.
    • Action: Build and maintain a strong professional network. Relationships will always be a cornerstone of business, irrespective of AI’s growth.
  10. AI is a Tool, Not a Replacement:
    • Insight: At its core, AI is a tool designed to aid human endeavors, not replace them entirely.
    • Action: Stay informed about how AI can be a tool in your toolkit, rather than a threat. Collaborate with AI, rather than competing against it.

The rise of AI undoubtedly brings challenges, but it also offers opportunities. By understanding the nuances of AI’s capabilities and limitations, and by proactively adapting, workers can not only ensure their relevance but also thrive in an AI-augmented workplace. The future isn’t about humans vs. machines; it’s about humans and machines working in tandem to achieve unprecedented outcomes.

The Ripple Effect of AI Touches Everyone

Every AI forecast is wrong because these forecasts implicitly miss how technology transforms a society.

There’s a misconception bouncing around that AI’s impact will be limited to those who directly use or interact with it. While people might not be saying this explicitly, the misconception is implicit in the AI forecasts flooding the news. The reality is far more encompassing. AI is not just about automating tasks or improving efficiency; it, like other transformative technologies, is about fundamentally changing the way our world operates.

The most recent addition comes from JP Morgan analyst Brian Nowak, who estimates AI will impact 44% of the labor force over the next few years. While the number seems impressive on the surface, it underestimates the broader implications. Figures like these only capture the direct impact.

Consider how the automobile changed society. Vehicle ownership crossed 50% in 1948, but even before that, the automobile was reshaping society, influencing which towns thrived and which ones died. It changed where houses were built and the types being constructed. It ushered in fast-food restaurants and motel franchises. In short, cars changed how society operated, and these changes were felt by everyone, not just those behind the wheel.

Consider how the smartphone has impacted society more recently, affecting everyone, not just those who carried one in their pocket. Today those initial changes are hard to see because 85% of us have a smartphone, but long before we reached this point, smartphones were already having a transformative effect on us all.

Even if you’re not using AI-powered tools or services, or you think your job is “secure,” the secondary and tertiary effects of AI will touch every aspect of our lives. From the way goods are produced and distributed to the way we communicate and make decisions, AI’s influence will be pervasive. The ripple effect will be felt by everyone, directly and indirectly.

In a world intertwined with AI, we need to understand that the future is interconnected, and AI will be a significant thread weaving through it all.

#AI #FutureOfWork #DigitalTransformation

Last week took me to Nashville, though sadly not for Taylor Swift. Instead, I was privileged to join the Unleash HR Summit. I have been fortunate to cross paths with Dirk Beveridge over the years and witness his remarkable efforts in transforming legacy distributors into dynamic and innovative market leaders. It’s great to see him extend his expertise to HR executives, because the radical reinvention of people and culture is central to any organization’s transformation. The conference was truly exceptional, and I was delighted to be part of such an incredible community.

My keynote centered on the intersection of HR and technology. Here are a few of the thoughts I shared:

  1. See the lasting impact technology, and the shift from digitization to ‘data’fication, is having on HR

Technology has revolutionized nearly every aspect of modern life, and the field of human resources is no exception. To stay ahead of the curve, HR executives must keep up with the latest advancements. New tools and software can provide valuable insights into workforce analytics, increase employee engagement, and improve talent management. HR executives must understand how these technologies impact the future of work and their teams on a personal level. It’s important to remember that technology often has unforeseen second-order effects, which can be much larger and more impactful. HR executives need to spend time understanding the second-order effects of tomorrow’s technologies.

  2. Embrace the Changing Nature of Artificial Intelligence (AI)

The future of work is decidedly human and AI is unlikely to take your job anytime soon. However, a burst of new AI applications does suggest those who can fully leverage its potential will have the most significant impact on their organizations. AI is disrupting the HR role in specific and special ways. Ultimately, it is also elevating the HR role to a more strategic level than it has sometimes played in the past. HR executives should aggressively experiment with new AI tools to find the ones that best support their unique mission and culture. These might be chatbots to answer employee questions and provide support 24/7 or perhaps predictive analytics to identify employees who are at risk of leaving so HR executives can take proactive measures to retain them.

  3. Understand how demographic shifts are changing HR

As Gen Z enters the workforce and becomes a larger consumer base, their tastes and preferences are exerting a greater influence on workplaces. This generation values quick access to information and the ability to independently solve problems. They expect services to be available “on demand” and facilitated by technology. HR technology tools need to change with changing demographics. At the same time, most organizations now have employees from more generations than ever before, and HR executives will need to balance their different needs. Technology can be utilized to improve the employee experience. HR executives can use tools such as employee self-service portals to provide convenient access to information. You might also use virtual and augmented reality to provide immersive training and enhance onboarding experiences. These tools can also be used to help scale the expertise of some of your most seasoned employees.

  4. Employees need more from their companies

Current data on employee sentiment is troubling.

  • Workers are broadly dissatisfied with their companies when it comes to their work.
  • Half of the workers report that they do not understand what is expected of them at work. I am concerned hybrid work environments might exacerbate this problem because they often eliminate the small clarifying conversations that occur serendipitously throughout a workday.
  • Only approximately one-third of workers feel that their company’s mission and purpose make their job feel significant, and similarly, only about a third of workers feel they have the chance to utilize their unique strengths every day to do what they do best.

Employees need more from their employers. They need to feel they are making a difference and be recognized for their contributions in meaningful ways. HR executives will play a central role in meeting these needs in the future, and technology will play an important role as well.

  5. Growth requires new processes

James Clear’s assertion is that “you do not rise to the level of your goals, you fall to the level of your systems.” This means that having lofty goals alone is not sufficient; we must also establish effective systems to reach them. It is not enough to apply new technologies to old processes. Modern technology demands new processes and procedures to harness its complete potential. HR executives should prioritize agility by fostering a culture of experimentation and innovation. They should inspire employees to contribute novel ideas and offer opportunities for learning and growth.

  6. Executives need to think differently about the future

HR executives must think differently about the future to remain competitive and relevant. The time to act is now. HR executives, in particular, must anticipate future trends and adapt their strategies to attract and retain top talent.

Lately I’ve been thinking a lot about where AI fits into corporate strategy and structure. Marc Andreessen recently suggested:

…there’s two obvious business models… one is to be a horizontal platform provider [or] infrastructure provider, analogous to the operating system or the database for the cloud. The other opportunity is in the verticals, the applications of AI.

…AI is a platform and an architecture, in the same sense that the mainframe was architecture, the mini computer was an architecture, the PC, the internet, the cloud, have been architectures. We think there are very good odds that AI is the next one of those. When there’s an architecture shift in our business…everything above the architecture gets rebuilt from scratch. Because the fundamental assumptions about what you’re building change. You’re no longer building a website, you’re no longer building a mobile app, you’re no longer building any of those things, you’re building instead an AI engine that is, in the ideal case, giving you the answer to whatever the question is. And if that’s the case, then basically all applications will change. Along with that all infrastructure will change. Basically, the entire industry will turn over again, the same way that it did with the internet, and the same way it did with mobile and cloud. And so if that’s the case, then it’s just it’s going to be like an absolute explosive period of growth for this entire industry.

What does this mean for business? It means incumbents are displaced because their products are no longer relevant. And the products become irrelevant because the service being performed is now delivered in a new way, perhaps by an AI engine running across the organization. Or perhaps the service is no longer delivered at all, because it has become redundant in a world where AI engines provide something superior to what came before.

If AI is a feature sitting on top of an already existing underlying service, then AI will eventually be added to most products and services. But it will sit on top of the service and the underlying service won’t actually change much.

I see AI as much more transformational. I see AI reframing the underlying service. In this way, AI creates new services. Entirely different services and experiences than what was previously being offered.

The Bureau of Industry and Security (BIS) issued an interim final rule that imposes a license requirement for the export and reexport of software specially designed to automate the analysis of geospatial imagery. The rule applies to exports and reexports to all countries beyond Canada.

The rule is scheduled to be published on Monday, January 6, 2020, but you can read the unpublished rule here.

Breast cancer is the most common cancer in women worldwide. Early detection and treatment can lower mortality rates. But clinicians still fail to identify breast cancer about 20 percent of the time (false-negative results). Clinicians also sometimes identify cancer when there is no breast cancer present (false-positive results). Studies suggest 7-12 percent of women will receive a false-positive result after one mammogram, and after 10 years of annual screening, more than half of women will receive at least one false-positive recall.
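The "more than half after 10 years" figure follows from compounding the per-screen false-positive rate. Assuming each annual screen is an independent event (a simplifying assumption), the cumulative probability of at least one false positive over n screens is 1 − (1 − p)^n:

```python
def prob_at_least_one_false_positive(per_screen_rate: float, screens: int) -> float:
    """Probability of at least one false positive across n screens,
    treating each screen as an independent event."""
    return 1 - (1 - per_screen_rate) ** screens

# Per-screen false-positive rates of 7-12% compounded over 10 annual screens:
low = prob_at_least_one_false_positive(0.07, 10)
high = prob_at_least_one_false_positive(0.12, 10)
print(f"{low:.0%} to {high:.0%}")  # roughly 52% to 72%
```

Even at the low end of the per-screen range, a decade of annual screening pushes the cumulative chance of a false-positive recall past one in two, consistent with the figures cited above.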

False-negative results provide a false sense of security and could ultimately hinder treatment effectiveness. False-positive results can cause anxiety and lead to unnecessary tests and procedures. Another hurdle in identifying breast cancer is a shortage of radiologists needed to read mammograms.

Researchers have developed an AI system that surpasses human experts in breast cancer identification. Their study results were recently published in the journal Nature.

We show an absolute reduction of 5.7% and 1.2% (USA and UK) in false positives and 9.4% and 2.7% in false negatives. We provide evidence of the ability of the system to generalize from the UK to the USA. In an independent study of six radiologists, the AI system outperformed all of the human readers…We ran a simulation in which the AI system participated in the double-reading process that is used in the UK, and found that the AI system maintained non-inferior performance and reduced the workload of the second reader by 88%.

The study results are promising. The AI system outperformed six radiologists and also lowered missed cancer diagnoses on the U.S. sample by 9 percent and mistaken readings of breast cancer by 6 percent. It also generalized across populations, something many AI systems have yet to achieve. The researchers stopped short of suggesting their AI system would replace humans.

The optimal use of the AI system within clinical workflows remains to be determined. The specificity advantage exhibited by the system suggests that it could help to reduce recall rates and unnecessary biopsies. The improvement in sensitivity exhibited in the US data shows that the AI system may be capable of detecting cancers earlier than the standard of care. An analysis of the localization performance of the AI system suggests it holds early promise for flagging suspicious regions for review by experts.

Beyond improving reader performance, the technology described here may have a number of other clinical applications. Through simulation, we suggest how the system could obviate the need for double reading in 88% of UK screening cases, while maintaining a similar level of accuracy to the standard protocol. We also explore how high-confidence operating points can be used to triage high-risk cases and dismiss low-risk cases. These analyses highlight the potential of this technology to deliver screening results in a sustainable manner despite workforce shortages in countries such as the UK.
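The "high-confidence operating points" idea in the quote above can be sketched as a simple dual-threshold triage rule: dismiss cases the model scores as clearly low-risk, fast-track clearly high-risk cases, and route everything in between to a human reader. The function and thresholds below are illustrative placeholders, not values from the paper:

```python
def triage(risk_score: float,
           dismiss_below: float = 0.05,
           flag_above: float = 0.95) -> str:
    """Route a case based on model confidence.

    Scores below the low threshold are dismissed as low risk,
    scores above the high threshold are flagged for priority review,
    and everything in between goes to a human reader.
    """
    if risk_score < dismiss_below:
        return "dismiss"
    if risk_score > flag_above:
        return "priority_review"
    return "human_read"

for score in (0.01, 0.50, 0.99):
    print(score, triage(score))
```

In practice the two thresholds would be chosen on a validation set to guarantee the desired sensitivity and specificity at each operating point; the human workload saved is the fraction of cases falling outside the middle band.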

At the same time, it becomes more difficult to make the case for approaches that are exclusively human. It is hard to imagine patients, insurance companies, and others won’t demand AI systems augment what humans are doing. This is especially true in healthcare, but it will likely become increasingly true in other domains as well. What tasks would you want humans to do alone if you know you can get better results (greater accuracy, faster turnaround, etc.) when human capability is augmented with AI systems?

Humans will need to learn how to incorporate these types of AI systems into their workflows. The next big step for AI seems to be “operationalizing AI.” This is likely a decade in the making, but slowly you will see individuals figuring out how to best work within environments that are being redefined by AI systems.