Comments
Apr 8 · Liked by Portsea Capital

I don't buy the data moats in AI as much as everyone else does. Not all data is the same; quality matters a lot. Compare textbooks vs. Twitter vs. Facebook text: you can learn far more "useful" things from textbooks than from random posts on Facebook. So I don't think the data inside TikTok or Facebook is very valuable.

Curating data has become a big thing (e.g., "Textbooks Are All You Need", https://arxiv.org/abs/2306.11644), and so has synthetic data: train on a video game to learn physics, or train on LLM outputs in clever ways (e.g., Q*). I don't think Facebook/Instagram/TikTok/NYTimes data is anywhere near as useful as people claim.
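
For concreteness, here is a rough sketch of what threshold-based quality filtering could look like. The "Textbooks Are All You Need" paper trained a classifier on GPT-4 quality annotations; the `quality_score` heuristic below is a made-up stand-in for that, not the paper's actual model, so treat this as illustrative only:

```python
# Illustrative sketch of quality-based data curation, in the spirit of
# "Textbooks Are All You Need" (arXiv:2306.11644). The paper trained a
# classifier on GPT-4 quality annotations; quality_score() below is a
# crude hypothetical stand-in, not their actual model.

def quality_score(text: str) -> float:
    """Return a fake 'educational value' score in [0, 1].

    Stand-in heuristic: longer, more structured text scores higher.
    A real pipeline would use a trained classifier or an LLM judge.
    """
    words = text.split()
    return min(1.0, len(words) / 50)

def curate(corpus: list[str], threshold: float = 0.3) -> list[str]:
    """Keep only documents whose quality score clears the threshold."""
    return [doc for doc in corpus if quality_score(doc) >= threshold]

if __name__ == "__main__":
    docs = [
        "lol nice pic",  # low-signal social-media text
        "A derivative measures how fast a function changes. Formally, "
        "f'(x) is the limit of the difference quotient as h approaches "
        "zero, which generalizes the notion of slope to curves.",
    ]
    print(curate(docs))  # only the textbook-style passage survives
```

The point of the sketch: under this kind of filter, volume alone (TikTok/Facebook-scale corpora) buys you little, because most of it scores below the threshold.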

Author

That’s a fair point and makes sense. I still think proprietary data is valuable; otherwise Google wouldn’t have paid Reddit for its content. What do you think are more defensible moats in LLMs?

Apr 9 · Liked by Portsea Capital

Reddit, I think, actually holds useful information for a lot of user queries. Same with Stack Overflow. Neither is large in token count, but both are unusually useful per token.

I think the moat at this point is execution: consistently delivering useful improvements over time, the way OpenAI has. But it's not a static moat, since we've seen competitors like Anthropic's Claude catch up so quickly. NVDA seems to be the only one with any real moat in AI.
