When trying to use glmnet, probably the most popular R package for regularized regression, to perform weighted lasso estimation, I struggled a lot with the `penalty.factor` argument of the `glmnet()` function. After a series of experiments, I finally understood how it works and how it affects the lambda sequence. I hope this article can help clarify your confusion as well.

First, some background. The main purpose of `penalty.factor` is to **allow different shrinkage on different betas**, so it can be used to perform **weighted lasso**. A lambda sequence is then used to compute the solution path…
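glmnet itself is an R package, but the weighted-lasso idea behind `penalty.factor` can be sketched in Python: penalizing coefficient j with weight w_j is equivalent to fitting a plain lasso on columns rescaled by 1/w_j and mapping the coefficients back. A minimal sketch with scikit-learn (the data, weights, and `alpha` here are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 200, 3
X = rng.standard_normal((n, p))
y = 3 * X[:, 0] + 3 * X[:, 1] + 0.1 * rng.standard_normal(n)

# Per-coefficient penalty weights (the role penalty.factor plays in glmnet):
# feature 1 is penalized 100x harder than the others.
w = np.array([1.0, 100.0, 1.0])

# Weighted lasso via rescaling: a plain lasso on X / w penalizes
# each beta_j in the original scale proportionally to w_j.
model = Lasso(alpha=0.1, fit_intercept=False).fit(X / w, y)
beta = model.coef_ / w  # map coefficients back to the original scale

print(beta)  # the heavily penalized coefficient is driven exactly to zero
```

The rescaling trick works because substituting b̃_j = w_j · b_j turns the penalty α Σ|b̃_j| into α Σ w_j |b_j| while leaving the fit X·b unchanged.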

D-separation is a critical idea in Bayesian Networks and causal inference. The problem it intends to tackle is: **given a causal graph G, is a set X of variables independent of another set Y, given a third set Z?**

At first sight, it may look intimidating. However, with a few examples, you should be able to understand it better. In this article, I focus mainly on the applied side and will cover:

- **three rules** to check d-separation in the corresponding scenarios;
- **one step-by-step algorithm** to check d-separation in general;
- how to use the R package **bnlearn** to check d-separation.
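A common step-by-step algorithm for the general case is the "ancestral graph + moralization" check. Here is a minimal plain-Python sketch of that procedure (the function and graph encoding are mine for illustration, not bnlearn's API):

```python
from collections import deque

def d_separated(parents, X, Y, Z):
    """Check whether X and Y are d-separated given Z in a DAG.

    parents: dict mapping each node to the list of its parents.
    Steps: (1) keep only ancestors of X | Y | Z, (2) moralize
    (connect co-parents, drop edge directions), (3) delete Z,
    (4) X and Y are d-separated iff no undirected path remains.
    """
    # 1. Ancestral subgraph of X, Y, and Z.
    keep, stack = set(), list(X | Y | Z)
    while stack:
        v = stack.pop()
        if v not in keep:
            keep.add(v)
            stack.extend(parents.get(v, []))

    # 2. Moralize: undirected parent-child edges plus co-parent edges.
    adj = {v: set() for v in keep}
    for v in keep:
        ps = [p for p in parents.get(v, []) if p in keep]
        for p in ps:
            adj[v].add(p); adj[p].add(v)
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                adj[ps[i]].add(ps[j]); adj[ps[j]].add(ps[i])

    # 3. Remove Z, then 4. breadth-first search from X toward Y.
    seen, queue = set(X), deque(X)
    while queue:
        v = queue.popleft()
        if v in Y:
            return False  # a path survives, so not d-separated
        for u in adj[v] - Z:
            if u not in seen:
                seen.add(u); queue.append(u)
    return True

# Collider A -> C <- B: A and B are d-separated marginally,
# but conditioning on the collider C opens the path.
g = {"A": [], "B": [], "C": ["A", "B"]}
print(d_separated(g, {"A"}, {"B"}, set()))  # True
print(d_separated(g, {"A"}, {"B"}, {"C"}))  # False
```

The collider example shows the characteristic behavior: conditioning on a common effect creates dependence between its causes.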

If…

The original paper: *Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer*.

This paper is more engineering-oriented than method-oriented, in my opinion. It doesn't propose new model architectures or training techniques, yet its contribution is tremendous. With the goal of investigating the exact contribution of various architectures, training objectives, techniques, and training datasets to transfer learning in NLP, the authors perform a series of systematic experiments and empirically identify the most promising strategies. They then combine all of their findings and propose the pre-trained model T5 and the C4 dataset. …

First, note that `scatter_()` is an in-place function, meaning that it will change the values of the input tensor.

The official documentation, `scatter_(dim, index, src) → Tensor`, tells us that the parameters are the dimension, the index tensor, and the source tensor. *dim* specifies along which dimension the index tensor operates; the other dimensions are left unchanged. As the function name suggests, the goal is to scatter values from the source tensor into the input tensor *self*. …
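For `dim=0` on 2-D tensors, the rule is `self[index[i][j]][j] = src[i][j]`. A plain-Python sketch (no PyTorch required) that mimics this behavior, using made-up example values:

```python
def scatter_dim0(self_, index, src):
    """Mimic Tensor.scatter_(0, index, src) for 2-D lists of lists:
    self_[index[i][j]][j] = src[i][j], iterating over index's shape."""
    for i in range(len(index)):
        for j in range(len(index[i])):
            self_[index[i][j]][j] = src[i][j]
    return self_

out = scatter_dim0(
    [[0] * 5 for _ in range(3)],          # 3x5 "input tensor" of zeros
    [[0, 1, 2, 0, 0], [2, 0, 0, 1, 2]],   # index: target row per entry
    [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]],  # src: values to scatter
)
print(out)  # [[1, 7, 8, 4, 5], [0, 2, 0, 9, 0], [6, 0, 3, 0, 10]]
```

Note how the column index j is preserved while `index` picks the destination row, which is exactly what "the other dimensions are left unchanged" means.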

A Ph.D. student in Statistics and NLP.