handmade.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
handmade.social is for all handmade artisans to create accounts for their Etsy and other handmade business shops.

Continued thread

Rarely do private companies find themselves punished for using #shell #companies to move capital and avoid taxes.

The fine accountants at Ernst & Young cooked up a complicated scheme in 2008 for a restructuring of #Koch #Industries via shell entities in Luxembourg,
a notorious tax haven, with the reasonable expectation that the ruse would never be revealed.

But then someone leaked a raft of private documents from #Mossack #Fonseca, a law firm in Panama that specializes in the creation of shell companies.

The info dump became known as the #Panama #Papers, and among its many revelations was Koch Industries’ bid to reinvent parts of the company, on paper,
as tax-avoidant Luxembourg shell companies.

According to the Center for Public Integrity, the essence of the Koch Industries deal was to
“reorder the ownership of many subsidiaries and centralize them under Luxembourg companies that are all served by internal corporate finance companies,
akin to a company’s own bank.”

Maybe that’s where the Koch siblings got the idea to get behind #DonorsTrust
—a sort of house bank for the array of political entities and think tanks they fund.

Of course, as with all of the organizations funded by Koch, they’re not in it alone.

#Betsy and #Dick #DeVos helped fund DonorsTrust, according to Mother Jones.

And then there are the many Koch-network “pass-through” groups, such as "Freedom Partners"
and the "Center to Protect Patient Rights",
which function much the way that shell companies do in the world of private capital:
they add layers of obfuscation over the provenance of the dollars flowing from one right-wing organization or institution to the next.

For instance, there’s the #Wellspring #Committee,
a pass-through funded in part via the Koch network, whose director, Ann Corkery, also sat for six years on the board of the #Becket #Fund for #Religious #Liberty,
a pro-bono law firm, according to tax filings.

With its portfolio of so-called religious freedom cases,
the Becket Fund gained notice as the firm representing the principals of the #Hobby #Lobby company in
a 2014 Supreme Court challenge to a mandate in the Affordable Care Act
for employer-based health insurance to cover,
without a co-pay, the costs of prescription #contraception.

One type of private company is the “#closely #held” variety, which may occasionally trade stock publicly, but has only a few shareholders.

The Supreme Court’s decision in favor of Hobby Lobby (number 106 on the 2016 Forbes list of the nation’s top private companies) specifically cited its “closely held” status as a qualification for its exemption from the ACA contraceptive mandate.


8/n

REFERENCES

[1] Iman Mirzadeh, Keivan Alizadeh Vahid, Hooman Shahrokhi, Oncel Tuzel, Samy Bengio, and Mehrdad Farajtabar. 2024. GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models. arxiv.org/abs/2410.05229

[2] Keyon Vafa, Justin Y. Chen, Jon Kleinberg, Sendhil Mullainathan, and Ashesh Rambachan. 2024. Evaluating the World Model Implicit in a Generative Model. arxiv.org/abs/2406.03689


7/

There are many interesting questions to explore from here: are these new metrics sufficient to capture models’ ability to learn world models? What about problems that can’t be reduced to a DFA? How can we map next-token prediction to learning a deeper and more holistic representation of the world, or should we find new ways?


5/

The authors then tested two instances of DFAs: navigating streets in New York City and playing the board game Othello. They found big gaps both between the conventional metrics (next-token test and current-state probe) and the new metrics (pictures), and in model performance (measured by the new metrics) when the “world” changes, e.g., when costs on certain roads in the city are randomly increased (“noisy shortest paths” in picture 1).


4/

The compression precision measures how well a model can conclude that the same state, no matter what input sequences are used to reach it, should lead to the same accepted sequences. The distinction precision, on the other hand, measures how well a model can recognize input sequences that lead to different states.
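To make the two metrics concrete, here is a rough sketch of the intuition in Python. The names (`true_state`, `model_same`) and the scoring functions are illustrative stand-ins, not the paper's exact estimators:

```python
# A rough sketch of the intuition behind the two metrics, not the
# paper's exact estimators. `true_state` maps an input prefix to the
# DFA state it reaches; `model_same` is a stand-in for asking the
# model whether two prefixes are interchangeable.

def compression_score(pairs, true_state, model_same):
    """Over prefix pairs that truly reach the same state, how often
    does the model treat them as interchangeable?"""
    same = [(a, b) for a, b in pairs if true_state(a) == true_state(b)]
    if not same:
        return 1.0
    return sum(model_same(a, b) for a, b in same) / len(same)

def distinction_score(pairs, true_state, model_same):
    """Over prefix pairs that truly reach different states, how often
    does the model tell them apart?"""
    diff = [(a, b) for a, b in pairs if true_state(a) != true_state(b)]
    if not diff:
        return 1.0
    return sum(not model_same(a, b) for a, b in diff) / len(diff)

# Toy world: the state is the parity of 1s seen so far.
parity = lambda s: s.count("1") % 2
pairs = [("0", "1"), ("11", "0"), ("1", "111")]

# A model that merges all prefixes scores perfectly on compression
# but completely fails to distinguish different states.
merge_all = lambda a, b: True
```

A degenerate model that treats every prefix the same way illustrates why both scores are needed: it looks perfect under compression alone and only the distinction score exposes it.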


3/

Another interesting work tests whether transformers really build world models by reducing problems to Deterministic Finite Automata (DFAs) — graphs of possible states and the transitions among them as input is consumed — and checking whether these models really learn them [2]. The authors introduce two new metrics: compression precision and distinction precision (picture).
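For a concrete picture of the abstraction involved, here is a minimal DFA in Python. The example (binary strings with an even number of 1s) is a textbook toy, not one of the tasks from the paper:

```python
# A minimal deterministic finite automaton (DFA): a transition
# function over (state, symbol) pairs, a start state, and a set of
# accepting states. Toy example: binary strings with an even number
# of 1s.
class DFA:
    def __init__(self, transitions, start, accepting):
        self.transitions = transitions  # maps (state, symbol) -> state
        self.start = start
        self.accepting = accepting

    def run(self, sequence):
        """Consume the input one symbol at a time, tracking the state."""
        state = self.start
        for symbol in sequence:
            state = self.transitions[(state, symbol)]
        return state

    def accepts(self, sequence):
        return self.run(sequence) in self.accepting

even_ones = DFA(
    transitions={("even", "0"): "even", ("even", "1"): "odd",
                 ("odd", "0"): "odd", ("odd", "1"): "even"},
    start="even",
    accepting={"even"},
)
```

Framing tasks this way is what makes the metrics computable: every input prefix lands in exactly one state, so there is a ground truth to compare a model's behavior against.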


2/

One recent work from colleagues at Apple gives negative results on math reasoning [1]. They show that on the GSM8K benchmark — a test set of grade-school-level math questions — SOTA transformer models perform significantly worse when the names, the numbers, or both are changed in otherwise identical problems (picture). This clearly demonstrates that these models don’t really learn the math principles behind the problems.
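The perturbation idea can be sketched in a few lines of Python. The template, names, and number ranges below are hypothetical, in the spirit of GSM-Symbolic rather than taken from the benchmark:

```python
import random

# Hypothetical GSM-Symbolic-style template (illustrative, not from
# the benchmark): names and numbers are placeholders, and the
# ground-truth answer is recomputed for every instantiation.
TEMPLATE = ("{name} picked {a} apples and then picked {b} more. "
            "How many apples does {name} have now?")
NAMES = ["Sophie", "Liam", "Maya"]

def instantiate(seed):
    """Sample one concrete question and its ground-truth answer."""
    rng = random.Random(seed)
    name = rng.choice(NAMES)
    a, b = rng.randint(2, 20), rng.randint(2, 20)
    question = TEMPLATE.format(name=name, a=a, b=b)
    return question, a + b  # the answer tracks the sampled numbers

# Two instantiations exercise the same reasoning step; a model that
# memorized surface patterns may solve one and not the other.
q1, ans1 = instantiate(0)
q2, ans2 = instantiate(1)
```

Because every instantiation shares the same underlying reasoning, any variance in a model's accuracy across seeds is attributable to surface features, which is exactly the effect the paper measures.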

Hi! I'm a bot that shares research papers in EvoDevo (or Evolutionary Developmental Biology).

Currently, I index a few journals in the field and post links to articles that were published recently, one per day.

Since my posts are set to “quiet public”, they don't appear on Mastodon's feeds or in hashtag searches. So, please, boost the papers you find worth sharing!

Any feedback is welcome. Thanks :)

Hello Lawprofs! As you probably know, hashtags are important for discovering content on Mastodon, which doesn't have full-text search by default. Here are a few you might find useful:

#papers - plug your new paper/book or flag someone else's

#conferences - announce upcoming conferences or workshops

#cfp - calls for papers

#hiring - announce or flag job openings

#teaching - discussions of classroom technique or materials

#lawreviews - discussions of the submission process and related topics

Hello you fine #neighbours, I wish you good things!

#reintroduction after a few years:

- will fav any #cat I come across, except from bots
- I want to see & read (your?) #visual #art #stories #papers #zines, please @ me if you want appreciation <3
- Your efforts to educate this #cis person are taken to heart
- I am a moderator in the #jungletrain #irc channel, DM me if there is an issue
- recently I have acquired #anarchist #literature, let's see how that goes :)

#Consent is important!