Understanding Git Dotfiles: The Real Purpose and Best Practices of .gitkeep, .gitignore, and .gitattributes




Introduction: Have You Encountered These Issues?

Have you ever experienced:

  • Empty directories won’t commit – wondering why your logs/ folder disappeared from Git?
  • CRLF vs LF conflicts – seeing “file changed” in diff even though you changed nothing?
  • Ineffective .gitignore – adding rules but files still getting tracked?
  • CI/CD failures – builds work locally but fail on GitHub Actions?
  • Uncertainty about commits – not sure whether to commit .github/, .gitkeep, or .gitattributes?

Most of these problems come from not knowing which dotfiles Git itself interprets and which are only platform features or team conventions.

Key Insight: Git dotfiles fall into two categories: “officially supported” and “community conventions.” Confusing the two is the primary source of cross-team collaboration issues.

In nearly every development project, you’ll find files and directories starting with .git. According to the Git Official Documentation, some are official Git components (like .gitignore and .gitattributes), some are platform extensions (like .github/ and .gitlab-ci.yml), and others are community conventions (like .gitkeep).

This article systematically covers official vs unofficial Git files from a practical perspective, providing best practices for team collaboration, DevOps integration, and cross-platform development.


1. Overview of Official Git Files (Native Git Support)

These files or directories are part of Git’s native design, with syntax, purpose, and behavior officially supported by Git.

| File Name | Official Status | Purpose | Common Examples |
|---|---|---|---|
| .git/ | ✅ Core | Stores the entire version control database (objects, index, branches, config) | .git/config, .git/refs/, .git/objects/ |
| .gitignore | ✅ Official | Defines which files or directories should not be tracked by Git | node_modules/, .env, *.log |
| .gitattributes | ✅ Official | Sets text normalization, merge strategies, LFS management | *.sh text eol=lf, *.zip binary |
| .gitmodules | ✅ Official | Records submodule sources and paths | [submodule "lib"] |
| .gitconfig | ✅ Official | Stores user-level or system-level configuration | [user] name=... |
| .mailmap | ✅ Official | Unifies author names and emails (for contribution statistics) | Merges different emails to same author |
| .git-blame-ignore-revs | ✅ Semi-official | Specifies commits to ignore in git blame | Ignore formatting commits |

2. Complete Git File Hierarchy Architecture

The following Mermaid diagram illustrates the complete layered structure of the Git file system:

graph TB
    subgraph "Git Ecosystem"
        A[Git Core Layer]
        B[Version Control Layer]
        C[Platform Extension Layer]
        D[Community Convention Layer]
        E[Developer Private Layer]
    end

    A --> A1[.git/ directory]
    A --> A2[.gitconfig]

    B --> B1[.gitignore]
    B --> B2[.gitattributes]
    B --> B3[.gitmodules]
    B --> B4[.mailmap]

    C --> C1[.github/]
    C --> C2[.gitlab-ci.yml]
    C --> C3[.gitreview]

    D --> D1[.gitkeep]
    D --> D2[.placeholder]

    E --> E1[.git/info/exclude]
    E --> E2[.gitignore_global]

    style A fill:#667eea,color:#fff
    style B fill:#764ba2,color:#fff
    style C fill:#f7f7fd,stroke:#667eea
    style D fill:#e8e8f5,stroke:#764ba2
    style E fill:#d4d4f0,stroke:#666

Key Concept: Five-Layer Separation Principle

| Layer | Who Manages | Commit? | Examples |
|---|---|---|---|
| Core Layer | Git itself | ❌ Never | .git/ |
| Version Control Layer | Team shared | ✅ Must | .gitignore, .gitattributes |
| Platform Extension Layer | Platform-specific | ✅ Platform-dependent | .github/, .gitlab-ci.yml |
| Community Convention Layer | Unofficial agreements | ⚙️ Situational | .gitkeep |
| Developer Private Layer | Individual environment | ❌ Don’t | .git/info/exclude |

3. Practical Scenario 1: Solving “Why Doesn’t .gitignore Work?”

The Problem

$ echo "secret.key" >> .gitignore
$ git add .gitignore
$ git commit -m "Add gitignore"

# But secret.key is still tracked!
$ git status
modified:   secret.key

Root Cause

Key Insight: .gitignore only affects files that are not yet tracked. Already-tracked files must first be removed from tracking using git rm --cached.

According to the Git Official gitignore Documentation, if a file was already git added, you must remove tracking first:

# Stop tracking but keep the local file
git rm --cached secret.key

# Make sure the rule is in .gitignore
echo "secret.key" >> .gitignore

# Commit the changes
git add .gitignore
git commit -m "Stop tracking secret.key"
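To find every file in this state instead of checking one path at a time, git ls-files can list tracked files that also match an ignore rule. The following is a minimal sketch in a throwaway repository; the demo identity, the file content, and the temporary directory are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway repository so the demo never touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email "demo@example.com"

# Reproduce the problem: the file is committed first and ignored afterwards.
echo "dummy-secret" > secret.key
git add -f secret.key
git commit -q -m "Accidentally commit secret.key"
echo "secret.key" > .gitignore

# List files that are tracked (-c) even though an ignore rule matches them (-i).
tracked_but_ignored="$(git ls-files -c -i --exclude-standard)"
echo "Tracked but ignored: $tracked_but_ignored"

# Stop tracking the file; the working-tree copy stays on disk.
git rm -q --cached secret.key
git add .gitignore
git commit -q -m "Stop tracking secret.key"

still_tracked="$(git ls-files secret.key)"
echo "Still tracked after the fix: ${still_tracked:-nothing}"
```

Running the same git ls-files command in an existing project gives the list of paths that need git rm --cached.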

Advanced Technique: Three-Layer Ignore Strategy

# 1. Team shared: .gitignore (committed to repo)
node_modules/
dist/
*.log

# 2. Project private: .git/info/exclude (not committed)
# Suitable for personal IDE settings
.vscode/
.idea/
.DS_Store

# 3. Global settings: ~/.gitignore_global (all projects)
# Setup:
git config --global core.excludesfile ~/.gitignore_global

Best Practices:

  • ✅ Team collaboration files (like node_modules/) go in .gitignore
  • ✅ Personal environment files (like .DS_Store) go in ~/.gitignore_global
  • ✅ Temporary test files go in .git/info/exclude
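When a file is ignored and it is not obvious which of the three layers is responsible, git check-ignore -v reports the source file, line number, and pattern that matched. The sketch below is a demonstration under assumptions: it runs in a throwaway repository and sets core.excludesFile in the repository's local config (pointing at a temporary file) so that the demo does not modify the real global configuration.

```shell
#!/usr/bin/env bash
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q

# Layer 1: team-shared rules.
echo '*.log' > .gitignore

# Layer 2: private, per-clone rules.
mkdir -p .git/info
echo '.vscode/' > .git/info/exclude

# Layer 3: stand-in for ~/.gitignore_global (local config, demo only).
global_ignore="$(mktemp)"
echo '.DS_Store' > "$global_ignore"
git config core.excludesFile "$global_ignore"

# Each line of output has the form <source>:<line>:<pattern><TAB><path>.
team_rule="$(git check-ignore -v debug.log)"
private_rule="$(git check-ignore -v .vscode/settings.json)"
global_rule="$(git check-ignore -v .DS_Store)"

printf '%s\n' "$team_rule" "$private_rule" "$global_rule"
```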


4. Practical Scenario 2: Cross-Platform CRLF Hell

The Problem

A Windows developer commits code, then Mac/Linux developers pull and see:

$ git diff
-#!/bin/bash^M
-echo "Hello"^M
+#!/bin/bash
+echo "Hello"

The entire file shows as “changed,” but it’s just different line endings (CRLF vs LF).

Solution: Use .gitattributes to Enforce Normalization

Best Practice: Every project should have a .gitattributes file using * text=auto to automatically normalize line endings.

  • Reason: Prevents CRLF/LF conflicts in cross-platform development
  • Exception: Pure Windows-only projects may skip this

According to the Git Official gitattributes Documentation, the recommended configuration is:

# .gitattributes

# Auto-detect text files and normalize
* text=auto

# Force specific files to use LF (Unix-style)
*.sh text eol=lf
*.py text eol=lf
*.md text eol=lf

# Force specific files to use CRLF (Windows-style)
*.bat text eol=crlf
*.ps1 text eol=crlf

# Binary files: no processing
*.png binary
*.jpg binary
*.zip binary

Why Do We Need This?

| Scenario | Without .gitattributes | With .gitattributes |
|---|---|---|
| Windows commit | May include CRLF | Git auto-converts to LF for storage |
| Mac checkout | Receives CRLF, diff explodes | Unified LF, clean diff |
| Shell scripts | May not execute (CRLF errors) | Forced LF, ensures proper execution |

Best Practices:

  • ✅ Every project should have .gitattributes
  • ✅ After adding or changing .gitattributes, run git add --renormalize . and commit the result, so files already in the repository are normalized too
  • ✅ Team should unify core.autocrlf settings
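The renormalization step can be watched with git ls-files --eol, which prints the line-ending style stored in the index (i/...) and in the working tree (w/...). This is a minimal sketch in a throwaway repository; the demo identity is hypothetical and core.autocrlf is set locally so the result does not depend on the machine's own settings.

```shell
#!/usr/bin/env bash
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email "demo@example.com"
git config core.autocrlf false            # keep the demo independent of user settings

# Commit a script with CRLF endings, as an unconfigured Windows setup might.
printf '#!/bin/bash\r\necho "Hello"\r\n' > script.sh
git add script.sh
git commit -q -m "Add script with CRLF endings"
before="$(git ls-files --eol script.sh)"

# Add the rules, then re-stage every tracked file through the new rules.
printf '* text=auto\n*.sh text eol=lf\n' > .gitattributes
git add .gitattributes
git add --renormalize .
git commit -q -m "Normalize line endings"
after="$(git ls-files --eol script.sh)"

echo "before: $before"
echo "after:  $after"
```

Only the stored copy changes here: the index entry moves from i/crlf to i/lf, while the file already on disk keeps its old endings until it is checked out again.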


5. Practical Scenario 3: “What is .gitkeep? Should I Use It?”

The Problem

You want to commit this directory structure:

project/
├── src/
│   └── main.py
└── logs/          # Empty directory, but Git doesn't track it

After committing, the logs/ folder is missing from every fresh clone, because Git tracks files, not directories, and an empty directory gives it nothing to track.

Community Solution: .gitkeep

# Create an empty file so Git tracks the directory
touch logs/.gitkeep
git add logs/.gitkeep
git commit -m "Add logs directory"

Important Concept: .gitkeep is NOT an Official Git Feature

Definition: .gitkeep is a community convention file used to make Git track empty directories. It is NOT part of the official Git specification—any filename (such as .placeholder or .empty) achieves the same effect.

| Item | Description |
|---|---|
| Official Support | ❌ Not part of Git specification |
| How It Works | Just an empty file making the directory “non-empty” so it can be tracked |
| Alternative Solutions | Can use .placeholder, .empty, README.md, or any filename |
| Should Commit? | ✅ Yes (if the directory must exist) |

Real-World Cases: When Do You Need .gitkeep?

# Case 1: Application-required empty directories
uploads/.gitkeep       # User upload directory
cache/.gitkeep         # Cache directory
logs/.gitkeep          # Log directory

# Case 2: Build process directories
dist/.gitkeep          # But should be in .gitignore
build/output/.gitkeep  # Compilation output directory

# Case 3: Docker volume mount points
docker/volumes/db/.gitkeep

Best Practices:

  • ✅ If the directory is “required at runtime,” use .gitkeep
  • ❌ If the directory only holds “build artifacts,” don’t commit it (add it to .gitignore)
  • ⚙️ Alternatively, create the directory with mkdir -p in a setup script and document that step in README.md
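A different technique from .gitkeep, and one that is often used for runtime directories, is to put a .gitignore inside the directory that ignores everything except itself. That keeps the directory in the repository and keeps its contents out in a single file. A minimal sketch, with logs/ and app.log as hypothetical names:

```shell
#!/usr/bin/env bash
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q

mkdir logs
# Ignore everything in logs/ except this .gitignore itself.
printf '*\n!.gitignore\n' > logs/.gitignore

# Simulate runtime output that must never be committed.
echo "runtime output" > logs/app.log

git add logs
tracked="$(git ls-files logs)"
echo "Tracked under logs/: $tracked"
```

Only logs/.gitignore is staged, so the directory exists after a clone while app.log stays local.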


6. Practical Scenario 4: CI/CD and Git File Integration

GitHub Actions Example

# .github/workflows/ci.yml
name: CI Pipeline
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      # Check if .gitattributes is effective
      - name: Check line endings
        run: |
          git ls-files --eol | grep 'i/crlf' && exit 1 || exit 0

      # Check for tracked sensitive files
      - name: Check for secrets
        run: |
          if git ls-files | grep -E '\.(env|key|pem)$'; then
            echo "❌ Sensitive files detected!"
            exit 1
          fi
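The same two checks can be run locally before pushing. The sketch below wraps them in a shell function, check_repo, which is a hypothetical name and not part of Git or GitHub Actions, and exercises it in a throwaway repository with made-up file names.

```shell
#!/usr/bin/env bash
set -eu

# Returns 0 when the repository passes both checks, 1 otherwise.
check_repo() {
  local status=0
  # Same test as the "Check line endings" step: CRLF stored in the index.
  if git ls-files --eol | grep '^i/crlf' >/dev/null; then
    echo "CRLF line endings found in the index" >&2
    status=1
  fi
  # Same test as the "Check for secrets" step: tracked .env/.key/.pem files.
  if git ls-files | grep -E '\.(env|key|pem)$' >/dev/null; then
    echo "Sensitive files are tracked" >&2
    status=1
  fi
  return "$status"
}

# Demo in a throwaway repository (hypothetical file names).
repo="$(mktemp -d)"
cd "$repo"
git init -q

printf 'print("ok")\n' > main.py
git add main.py
if check_repo; then clean_result="pass"; else clean_result="fail"; fi

# Force-add a dummy .env to simulate an accidental commit.
printf 'TOKEN=dummy\n' > .env
git add -f .env
if check_repo; then dirty_result="pass"; else dirty_result="fail"; fi

echo "clean repo: $clean_result, with .env tracked: $dirty_result"
```

Calling check_repo from a pre-commit or pre-push hook is one way to catch these problems before CI does.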

GitLab CI Example

# .gitlab-ci.yml
stages:
  - validate
  - test

validate_git_files:
  stage: validate
  script:
    # Ensure .gitattributes exists
    - test -f .gitattributes || (echo "Missing .gitattributes" && exit 1)
    # Ensure no CRLF issues
    - git ls-files --eol | grep 'i/crlf' && exit 1 || exit 0

Docker and .gitignore Collaboration

# Dockerfile
FROM node:18
WORKDIR /app

# Copy package.json (not affected by .gitignore)
COPY package*.json ./

# Install dependencies
RUN npm ci

# Copy source code (respects .dockerignore)
COPY . .

# Build
RUN npm run build

# .dockerignore (a separate file; similar to .gitignore, but for Docker)
node_modules/
.git/
.github/
*.log
.env

Key Differences:

| Item | .gitignore | .dockerignore |
|---|---|---|
| Affects | Git version control | Docker build context |
| When to Use | Decide which files not to track | Decide which files not to send to Docker |
| Common Contents | node_modules/, .env | .git/, README.md, *.md |


7. In-Depth Analysis of .git/ Internal Structure (Advanced)

While you should not manually edit the .git/ directory, understanding its structure helps with debugging:

.git/
├── HEAD                  # Points to current branch
├── config               # Project-level configuration
├── description          # Description for GitWeb
├── hooks/               # Git hooks scripts
│   ├── pre-commit       # Execute before commit
│   ├── post-merge       # Execute after merge
│   └── pre-push         # Execute before push
├── info/
│   └── exclude          # Private ignore list
├── objects/             # Git object database (commit, tree, blob)
│   ├── pack/            # Compressed objects
│   └── [0-9a-f]{2}/     # Objects categorized by first 2 chars of hash
├── refs/
│   ├── heads/           # Local branches
│   ├── remotes/         # Remote branches
│   └── tags/            # Tags
└── index                # Staging area

Useful Debugging Commands

# View current branch
cat .git/HEAD
# Output: ref: refs/heads/main

# View commit that branch points to
cat .git/refs/heads/main
# Output: a1b2c3d4... (commit hash)

# View object type
git cat-file -t a1b2c3d4
# Output: commit / tree / blob

# View object content
git cat-file -p a1b2c3d4

# Check database integrity
git fsck --full

# Clean up unnecessary objects
git gc --aggressive

8. Advanced .gitattributes Applications: Merge Strategies and Git LFS

Custom Merge Strategies

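# .gitattributes

# union is a built-in driver: on conflict it keeps the lines from both sides
# (this can leave structured files such as JSON invalid, so review the result)
package-lock.json merge=union

# "ours" and "theirs" are NOT built in: they name custom merge drivers that
# every clone must define as merge.<name>.driver in its Git config

# Database migration files always use our version
db/migrations/* merge=ours

# Config files always use their version
config/production.yml merge=theirs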
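Of the three names used above, only union is a driver that Git ships with; merge=ours and merge=theirs take effect only after a driver with that name has been defined in Git config. The sketch below shows one common way to define them and checks the result in a throwaway repository. The driver commands (true, and cp -f %B %A), the demo identity, and the file contents are assumptions of this example, not something the article prescribes.

```shell
#!/usr/bin/env bash
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email "demo@example.com"

# "ours": `true` exits 0 without touching %A, so our version is kept.
git config merge.ours.driver true
# "theirs": overwrite %A (our version, also the result file) with %B (theirs).
git config merge.theirs.driver 'cp -f %B %A'

printf 'db/schema.sql merge=ours\nconfig/production.yml merge=theirs\n' > .gitattributes
mkdir -p db config
echo "base" > db/schema.sql
echo "base" > config/production.yml
git add .
git commit -q -m "Base"
main_branch="$(git symbolic-ref --short HEAD)"

# Change both files on a feature branch...
git checkout -q -b feature
echo "feature" > db/schema.sql
echo "feature" > config/production.yml
git commit -q -am "Feature changes"

# ...and on the original branch, so a plain merge would conflict.
git checkout -q "$main_branch"
echo "main" > db/schema.sql
echo "main" > config/production.yml
git commit -q -am "Main changes"

git merge -q --no-edit feature
schema="$(cat db/schema.sql)"
production="$(cat config/production.yml)"
echo "db/schema.sql: $schema, config/production.yml: $production"
```

Driver definitions live in Git config, which is not committed, so each collaborator and CI runner has to run the two git config commands; without them Git does not apply these rules.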

Git LFS (Large File Storage) Integration

When projects have large binary files (like videos, 3D models, datasets):

# Install Git LFS
git lfs install

# .gitattributes configuration
*.psd filter=lfs diff=lfs merge=lfs -text
*.mp4 filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text

# View LFS-tracked files
git lfs ls-files

# View the LFS environment and configuration
git lfs env

Why Do We Need Git LFS?

  • ❌ Regular Git: Every changed version of a binary file is stored as a new full object that compresses poorly, so the repository bloats quickly
  • ✅ Git LFS: The repository stores only small pointer files; the actual content lives on the LFS server


9. File Commit Decision Flow Chart

When you’re unsure whether a file should be committed, use this decision tree:

flowchart TD
    Start([Discovered file starting with .git]) --> Q1{Is it inside .git/ directory?}

    Q1 -->|Yes| Never[❌ Never commit<br/>This is Git internal data]
    Q1 -->|No| Q2{Is it an official Git file?}

    Q2 -->|Yes| Q3{Is it global config?}
    Q2 -->|No| Q4{Is it platform-specific?}

    Q3 -->|Yes<br/>.gitconfig| Never2[❌ Don't commit<br/>Personal setting]
    Q3 -->|No<br/>.gitignore<br/>.gitattributes<br/>.gitmodules| Commit[✅ Should commit<br/>Team shared rules]

    Q4 -->|Yes<br/>.github/<br/>.gitlab-ci.yml| CI[✅ Commit<br/>If using that platform]
    Q4 -->|No| Q5{Is it community convention?}

    Q5 -->|Yes<br/>.gitkeep| Keep[⚙️ Situational<br/>Commit when keeping empty dirs]
    Q5 -->|No| Unknown[⚠️ Unknown file<br/>Verify purpose first]

    style Never fill:#ff6b6b,color:#fff
    style Never2 fill:#ff6b6b,color:#fff
    style Commit fill:#51cf66,color:#fff
    style CI fill:#51cf66,color:#fff
    style Keep fill:#ffd43b,color:#333
    style Unknown fill:#ff922b,color:#fff

10. Common Questions FAQ

Q1: How to Fix .gitignore Not Ignoring Already-Tracked Files?

A: Use git rm --cached to remove tracking status:

# Remove tracking for single file
git rm --cached secret.key

# Remove tracking for entire directory
git rm -r --cached logs/

# Apply new .gitignore rules
git add .gitignore
git commit -m "Update gitignore and stop tracking files"

Check if it’s working:

git check-ignore -v secret.key
# Output: .gitignore:5:secret.key    secret.key

Q2: Why Can’t My Shell Scripts Execute on CI/CD?

A: Usually a CRLF issue. Check and fix:

# Check file line endings
git ls-files --eol | grep script.sh

# Fix method 1: Use .gitattributes to force LF
echo "*.sh text eol=lf" >> .gitattributes
git add --renormalize .
git commit -m "Fix line endings for shell scripts"

# Fix method 2: Local conversion (temporary)
dos2unix script.sh  # Linux/Mac

Q3: What’s the Difference Between .github/ and .git/?

A: Completely different things:

| Item | .git/ | .github/ |
|---|---|---|
| Attribute | Official Git core directory | GitHub platform extension directory |
| Purpose | Stores version control database | Stores GitHub Actions, issue templates |
| Should Commit? | ❌ Never commit | ✅ Should commit (if using GitHub) |
| Cross-platform | ✅ Works in all Git environments | ❌ Only effective on GitHub platform |
| Examples | .git/config, .git/objects/ | .github/workflows/ci.yml |

Q4: How to Use Different CI/CD on Multiple Platforms (GitHub + GitLab)?

A: Both can coexist:

project/
├── .github/
│   └── workflows/
│       └── ci.yml       # GitHub Actions
├── .gitlab-ci.yml       # GitLab CI
└── azure-pipelines.yml  # Azure DevOps

Notes:

  • ✅ Each platform runs its own CI/CD
  • ⚠️ Each platform needs its own config file, and all of them have to be kept in sync
  • Recommended to use the same test commands (like npm test) for consistency


Q5: What Merge Strategies Are Available in .gitattributes?

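A: Git ships three built-in merge drivers (text, binary, and union). Names such as ours or theirs only work once a custom driver with that name has been defined in Git config as merge.<name>.driver:

# 1. union: Keep lines from both sides (built-in)
package-lock.json merge=union

# 2. ours: Always use our version (custom driver, must be defined in Git config)
db/schema.sql merge=ours

# 3. theirs: Always use their version (custom driver, must be defined in Git config)
config/production.yml merge=theirs

# 4. binary: Treat as binary, conflicts must be resolved manually
*.docx binary

Practical Applications:

# Auto-generated files use union (review the result: union can produce an invalid file)
package-lock.json merge=union
yarn.lock merge=union

# Database migration files use ours (protect production)
migrations/*.sql merge=ours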

Q6: How to Make git blame Ignore Formatting Commits?

A: Use .git-blame-ignore-revs file:

# .git-blame-ignore-revs
# Each entry must be a full, unabbreviated commit hash on its own line

# Formatting commit (Prettier)
<full hash of the Prettier formatting commit>

# Refactoring commit (ESLint autofix)
<full hash of the ESLint autofix commit>

# Usage
git blame --ignore-revs-file .git-blame-ignore-revs file.js

# Set as default
git config blame.ignoreRevsFile .git-blame-ignore-revs
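The entries in the ignore file have to be full commit hashes, which git rev-parse can produce from any revision name. The sketch below builds the file that way in a throwaway repository; the demo identity, file.js contents, and commit messages are hypothetical.

```shell
#!/usr/bin/env bash
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email "demo@example.com"

# One real change followed by a formatting-only commit.
printf 'const a=1\n' > file.js
git add file.js
git commit -q -m "Add file.js"
printf 'const a = 1;\n' > file.js
git commit -q -am "Reformat with Prettier"

# rev-parse expands a revision name (or a short hash) to the full hash.
formatting_commit="$(git rev-parse HEAD)"

{
  echo "# Formatting commit (Prettier)"
  echo "$formatting_commit"
} > .git-blame-ignore-revs

# Make blame read the file by default in this clone.
git config blame.ignoreRevsFile .git-blame-ignore-revs
blame_output="$(git blame file.js)"
echo "$blame_output"
```

The blame.ignoreRevsFile setting is local to each clone, so the file is usually committed and the git config command documented for contributors.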

Q7: How to Check All Git Configuration in a Project?

A: Check by layer:

# 1. System-level configuration
git config --system --list

# 2. Global configuration (user-level)
git config --global --list

# 3. Project-level configuration
git config --local --list

# 4. View specific setting
git config user.name
git config core.autocrlf

# 5. View all settings (including source)
git config --list --show-origin

Q8: How to Clean Sensitive Information from Git Repository?

A: Use git filter-repo (recommended) or BFG Repo-Cleaner:

# Install git-filter-repo
pip3 install git-filter-repo

# Remove specific file's history (run this in a fresh clone)
git filter-repo --path secret.key --invert-paths

# Remove specific text (like API Key)
git filter-repo --replace-text <(echo "SECRET_API_KEY==>REDACTED")

# filter-repo removes the origin remote; re-add it before pushing
git remote add origin <repository-url>

# Force push (warning: rewrites history)
git push origin --force --all
git push origin --force --tags

⚠️ Important: This operation rewrites history, so all collaborators must re-clone. Treat any secret that was ever pushed as compromised and rotate it.


Q9: Why Must .gitmodules Be Committed?

A: Because it records submodule configuration, team members need it for proper initialization:

# .gitmodules example
[submodule "libs/common"]
    path = libs/common
    url = https://github.com/company/common-lib.git
    branch = main

# Initialize submodules (requires .gitmodules)
git submodule update --init --recursive

What happens without .gitmodules?

  • ❌ After cloning, the submodule paths are only empty directories
  • ❌ git submodule update will fail because Git cannot find the submodule URL
  • ❌ CI/CD can’t auto-initialize submodules


Q10: How to Safely View .git/ Internal Structure?

A: Use read-only commands:

# View directory structure (limit depth to avoid too much output)
tree .git/ -L 2

# View HEAD pointer
cat .git/HEAD

# View recent commit content
git cat-file -p HEAD

# View object type
git cat-file -t HEAD

# Check repository integrity
git fsck --full

# View all refs
git show-ref

⚠️ Never manually edit files inside .git/ unless you fully understand the consequences.


11. Complete Official vs Non-Official Files Comparison Table

| Category | File | Official Status | Should Commit | Purpose |
|---|---|---|---|---|
| Core Layer | .git/ | ✅ Official | ❌ Never | Git database |
| Core Layer | .gitconfig | ✅ Official | ❌ Personal | User configuration |
| Version Control Layer | .gitignore | ✅ Official | ✅ Must | Ignore file rules |
| Version Control Layer | .gitattributes | ✅ Official | ✅ Must | Text normalization, merge strategies |
| Version Control Layer | .gitmodules | ✅ Official | ✅ Must | Submodule configuration |
| Version Control Layer | .mailmap | ✅ Official | ✅ Optional | Author name unification |
| Version Control Layer | .git-blame-ignore-revs | ✅ Semi-official | ✅ Optional | Ignore formatting commits |
| Platform Extension Layer | .github/ | ❌ Platform | ✅ If using GitHub | GitHub Actions, templates |
| Platform Extension Layer | .gitlab-ci.yml | ❌ Platform | ✅ If using GitLab | GitLab CI/CD |
| Platform Extension Layer | .gitreview | ❌ Platform | ✅ If using Gerrit | Gerrit Code Review |
| Community Convention Layer | .gitkeep | ❌ Convention | ⚙️ Situational | Keep empty directories |
| Private Layer | .git/info/exclude | ✅ Official | ❌ Don’t commit | Private ignore list |
| Private Layer | ~/.gitignore_global | ✅ Official | ❌ Don’t commit | Global ignore rules |

12. Best Practices Checklist

✅ What Every Project Should Have

✅ .gitignore           # Ignore node_modules, .env, *.log

✅ .gitattributes       # Unify line endings (* text=auto)

✅ README.md            # Project documentation

⚙️ .gitkeep             # If empty directories needed (like logs/, uploads/)

⚙️ .git-blame-ignore-revs  # If large-scale formatting commits exist

✅ What Cross-Platform Development Teams Must Have

# .gitattributes
* text=auto
*.sh text eol=lf
*.bat text eol=crlf
*.png binary
*.jpg binary

✅ What CI/CD Projects Should Have

# .github/workflows/validate-git.yml
name: Validate Git Configuration
on: [push, pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Check .gitattributes exists
        run: test -f .gitattributes

      - name: Check for CRLF issues
        run: |
          if git ls-files --eol | grep 'i/crlf'; then
            echo "❌ CRLF issues detected"
            exit 1
          fi

      - name: Check for tracked secrets
        run: |
          if git ls-files | grep -E '\.(env|key|pem)$'; then
            echo "❌ Sensitive files tracked"
            exit 1
          fi

13. Conclusion and Key Takeaways

Understanding Git file system boundaries helps you:

  • Avoid conflicts: Unify line endings through .gitattributes
  • Protect privacy: Properly use .gitignore and .git/info/exclude
  • Improve collaboration: Team shares .gitignore and .gitattributes
  • Optimize CI/CD: Leverage .github/ or .gitlab-ci.yml
  • Debug faster: Understand .git/ internal structure

Quick Reference Table

| Need | Use File | Example |
|---|---|---|
| Ignore files | .gitignore | node_modules/, *.log |
| Unify line endings | .gitattributes | * text=auto, *.sh eol=lf |
| Keep empty directory | .gitkeep | logs/.gitkeep |
| Private ignore | .git/info/exclude | .vscode/, .idea/ |
| CI/CD | .github/ or .gitlab-ci.yml | GitHub Actions |
| Large files | .gitattributes + Git LFS | *.psd filter=lfs |
