Unarchive - Extract Archive Files
Learn how to extract archive files with automatic format detection and security protections.
What You'll Learn
- Extracting tar, tar.gz, tgz, and zip archives
- Using strip_components to remove leading directories
- Idempotency with the creates parameter
- Handling different archive formats
- Security protections against path traversal
Quick Start
What It Does
- Downloads sample archives (or uses provided ones)
- Extracts various archive formats
- Demonstrates path stripping
- Shows idempotent extraction
- Extracts to system directories with sudo
Key Concepts
Basic Extraction
Extract an archive to a destination directory:
The destination directory is created if it doesn't exist.
Supported Formats
Mooncake automatically detects the archive format from the file extension:
| Format | Extensions | Compression |
|---|---|---|
| tar | .tar | None |
| tar.gz | .tar.gz, .tgz | Gzip |
| zip | .zip | ZIP compression |
Detection is case-insensitive (.TAR, .TGZ, .ZIP all work).
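The detection rule above can be sketched in a few lines of Python. This is an illustration only, not Mooncake's implementation: `detect_format` and `FORMATS` are hypothetical names, and the mapping is copied from the table above.

```python
# Illustrative sketch only: detect_format and FORMATS are hypothetical names,
# not part of Mooncake. The mapping mirrors the table above.
FORMATS = {
    ".tar.gz": "tar.gz",
    ".tgz": "tar.gz",
    ".tar": "tar",
    ".zip": "zip",
}


def detect_format(path: str) -> str:
    """Return the archive format implied by the file extension."""
    lowered = path.lower()  # case-insensitive: .TGZ and .tgz are treated alike
    for suffix, fmt in FORMATS.items():
        if lowered.endswith(suffix):
            return fmt
    raise ValueError(f"unsupported archive extension: {path}")


print(detect_format("/tmp/node-v20.tar.gz"))  # tar.gz
print(detect_format("/tmp/DATA.ZIP"))         # zip
```

Because the decision is based on the file name alone, an archive saved under an unrecognized extension would need to be renamed before a rule like this could classify it.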
Strip Components
Remove leading directory levels from extracted paths:
# Archive contains: node-v20/bin/node, node-v20/lib/...
- name: Extract without top-level directory
  unarchive:
    src: /tmp/node-v20.tar.gz
    dest: /opt/node
    strip_components: 1
# Result: /opt/node/bin/node, /opt/node/lib/...
How it works:
Archive structure:
project-1.0/src/main.go
project-1.0/src/utils.go
project-1.0/README.md
strip_components: 0 (default) → dest/project-1.0/src/main.go
strip_components: 1 → dest/src/main.go
strip_components: 2 → dest/main.go
Files with fewer path components than specified are skipped.
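The path arithmetic above can be reproduced with a short Python sketch. It is not Mooncake's code: `strip_components` here is a hypothetical helper, and it assumes (following the behavior of `tar --strip-components`) that an entry left with no path after stripping is skipped as well.

```python
from typing import Optional


def strip_components(entry: str, count: int) -> Optional[str]:
    """Drop `count` leading path components; return None if the entry is skipped."""
    parts = [p for p in entry.split("/") if p]  # ignore empty segments
    if len(parts) <= count:
        # Nothing would remain after stripping (assumption: such entries are skipped).
        return None
    return "/".join(parts[count:])


for n in (0, 1, 2):
    print(n, strip_components("project-1.0/src/main.go", n))
# 0 project-1.0/src/main.go
# 1 src/main.go
# 2 main.go

print(strip_components("project-1.0/README.md", 2))  # None
```

Under that assumption, project-1.0/README.md disappears from the output at strip_components: 2, so it is worth checking the archive layout before choosing a value larger than 1.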
Idempotency with Creates
Skip extraction if a marker file already exists:
- name: Extract application
  unarchive:
    src: /tmp/myapp.tar.gz
    dest: /opt/myapp
    creates: /opt/myapp/bin/myapp
    mode: "0755"
On subsequent runs, if /opt/myapp/bin/myapp exists, extraction is skipped. This prevents unnecessary re-extraction and maintains idempotency.
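The skip decision amounts to a single existence check. The Python sketch below illustrates it with a hypothetical `should_extract` helper; it is an assumption about the logic described above, not Mooncake's implementation.

```python
import os
import tempfile
from typing import Optional


def should_extract(creates: Optional[str]) -> bool:
    """Return True when extraction should run, False when it can be skipped."""
    if not creates:
        return True  # no marker configured: always extract
    return not os.path.exists(creates)


with tempfile.TemporaryDirectory() as dest:
    marker = os.path.join(dest, "bin", "myapp")
    first_run = should_extract(marker)    # marker missing
    os.makedirs(os.path.dirname(marker))
    open(marker, "w").close()             # stand-in for the extracted binary
    second_run = should_extract(marker)   # marker present

print(first_run, second_run)  # True False
```

A check like this only tests that the path exists, not that the archive contents are current, so a marker that is specific to the release being installed gives a more reliable signal.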
Custom Directory Permissions
Set permissions for created directories:
- name: Extract with custom permissions
  unarchive:
    src: /tmp/app.tar.gz
    dest: /opt/myapp
    mode: "0700" # rwx------
File permissions are preserved from the archive. The mode parameter only affects directories created during extraction.
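To see how an octal mode string maps to permission bits, the Python sketch below converts values such as "0700" into rwx notation. It assumes the mode is read as an octal string, as the rwx------ comment in the example suggests; `describe_dir_mode` is a hypothetical helper for illustration.

```python
import stat


def describe_dir_mode(mode: str) -> str:
    """Render an octal mode string such as "0700" as rwx notation."""
    bits = int(mode, 8)  # "0700" -> 0o700
    # stat.filemode expects a full st_mode; add the directory type bit,
    # then drop the leading "d" so only the permission triplets remain.
    return stat.filemode(stat.S_IFDIR | bits)[1:]


print(describe_dir_mode("0700"))  # rwx------
print(describe_dir_mode("0755"))  # rwxr-xr-x
```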
Extract with Privilege Escalation
Extract to system directories using sudo:
- name: Extract to system directory
  unarchive:
    src: /tmp/app.tar.gz
    dest: /opt/myapp
    strip_components: 1
    mode: "0755"
  become: true
Using Variables
Use template variables in all paths:
- vars:
    app_version: "1.2.3"
    install_dir: "/opt/myapp"

- name: Extract versioned release
  unarchive:
    src: "/tmp/app-{{app_version}}.tar.gz"
    dest: "{{install_dir}}"
    creates: "{{install_dir}}/bin/app"
    strip_components: 1
Extract Multiple Archives
Use loops to extract multiple archives:
- vars:
    packages:
      - name: app
        file: app-v1.2.3.tar.gz
        strip: 1
      - name: data
        file: data.zip
        strip: 0

- name: Extract {{item.name}}
  unarchive:
    src: /tmp/{{item.file}}
    dest: /opt/{{item.name}}
    strip_components: "{{item.strip}}"
    creates: /opt/{{item.name}}/.installed
  with_items: "{{packages}}"
Security Features
Mooncake automatically protects against path traversal attacks.
Blocked Patterns
These malicious patterns are automatically blocked:
# Path traversal with ../
Archive entry: ../../../etc/passwd
# Absolute paths
Archive entry: /etc/passwd
# Traversal in nested paths
Archive entry: legit/../../sensitive
# Symlinks escaping destination
Symlink target: ../../../etc/shadow
All extracted paths are validated to ensure they stay within the destination directory.
Security Guarantees
- Path Traversal Protection: All entries with ../ are rejected
- Absolute Path Blocking: Absolute paths are not allowed
- Symlink Validation: Symlink targets are checked for escapes
- Safe Joining: Uses pathutil.SafeJoin() for all paths
These protections are always active and cannot be disabled.
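The checks listed above can be illustrated with a small Python sketch. It is not a port of pathutil.SafeJoin(), whose source is not shown here: `safe_join`, `check_symlink`, and `is_within` are hypothetical names, the logic is purely lexical, and POSIX-style paths are assumed.

```python
import posixpath


def is_within(base: str, path: str) -> bool:
    """True if the normalized path is the base directory or lies beneath it."""
    base = posixpath.normpath(base)
    path = posixpath.normpath(path)
    return path == base or path.startswith(base.rstrip("/") + "/")


def safe_join(dest: str, entry: str) -> str:
    """Join an archive entry onto dest, rejecting absolute and ../ paths."""
    if posixpath.isabs(entry):
        raise ValueError("absolute path not allowed")
    if ".." in entry.split("/"):
        raise ValueError("path traversal not allowed")
    target = posixpath.normpath(posixpath.join(dest, entry))
    if not is_within(dest, target):  # defensive check on the joined result
        raise ValueError("entry escapes destination")
    return target


def check_symlink(dest: str, link_entry: str, link_target: str) -> str:
    """Resolve a symlink target relative to the link and require it to stay in dest."""
    link_path = safe_join(dest, link_entry)
    if posixpath.isabs(link_target):
        raise ValueError("absolute symlink target not allowed")
    resolved = posixpath.normpath(
        posixpath.join(posixpath.dirname(link_path), link_target)
    )
    if not is_within(dest, resolved):
        raise ValueError("symlink target escapes destination")
    return resolved


for entry in ["bin/node", "../../../etc/passwd", "/etc/passwd", "legit/../../sensitive"]:
    try:
        print("ok     ", safe_join("/opt/myapp", entry))
    except ValueError as err:
        print("blocked", entry, "->", err)
```

In this sketch a relative symlink target such as ../lib/node is accepted when it resolves inside the destination, while ../../../etc/shadow is rejected. Whether Mooncake permits in-bounds ../ in symlink targets is an assumption, not something the text above states.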
Complete Example
Here's a complete example showing common patterns:
version: "1.0"

vars:
  node_version: "20.11.0"
  install_dir: "/opt/node"
  backup_dir: "/var/backups"

steps:
  # Download archive if needed
  - name: Download Node.js
    shell: "curl -fsSL https://nodejs.org/dist/v{{node_version}}/node-v{{node_version}}-linux-x64.tar.gz -o /tmp/node.tar.gz"
    creates: "/tmp/node.tar.gz"

  # Extract with strip_components
  - name: Extract Node.js
    unarchive:
      src: "/tmp/node.tar.gz"
      dest: "{{install_dir}}"
      strip_components: 1
      creates: "{{install_dir}}/bin/node"
      mode: "0755"
    register: node_extracted

  # Verify installation
  - name: Check Node.js version
    shell: "{{install_dir}}/bin/node --version"
    when: node_extracted.changed

  # Extract ZIP archive
  - name: Extract application data
    unarchive:
      src: "/tmp/app-data.zip"
      dest: "{{install_dir}}/data"
      mode: "0755"

  # Extract backup with sudo
  - name: Restore system backup
    unarchive:
      src: "{{backup_dir}}/system-backup.tar.gz"
      dest: "/etc/myapp"
      creates: "/etc/myapp/.restored"
      mode: "0755"
    become: true
Common Use Cases
Software Installation
Extract and install precompiled binaries:
- name: Install Go
  unarchive:
    src: /tmp/go1.21.linux-amd64.tar.gz
    dest: /usr/local
    creates: /usr/local/go/bin/go
  become: true
Application Deployment
Deploy application releases:
- name: Deploy application
  unarchive:
    src: /tmp/myapp-{{version}}.tar.gz
    dest: /opt/myapp
    strip_components: 1
    mode: "0755"
  become: true

- name: Create version marker
  file:
    path: /opt/myapp/.version
    content: "{{version}}"
    state: file
  become: true
Backup Restoration
Restore from tar backups:
- name: Restore user data
  unarchive:
    src: /backups/user-data-{{date}}.tar.gz
    dest: /home/{{username}}
    creates: /home/{{username}}/.restored
Multi-platform Distribution
Extract platform-specific archives:
- name: Extract platform binary
  unarchive:
    src: "/tmp/app-{{os}}-{{arch}}.tar.gz"
    dest: /opt/app
    strip_components: 1
    creates: /opt/app/bin/app
Real-World Example
Complete Node.js installation workflow:
version: "1.0"

vars:
  node_version: "20.11.0"
  node_base_url: "https://nodejs.org/dist"
  install_dir: "/opt/node"

steps:
  - name: Detect platform
    shell: "uname -s | tr '[:upper:]' '[:lower:]'"
    register: platform_result

  - name: Detect architecture
    shell: "uname -m"
    register: arch_result

  - name: Set Node.js archive name
    vars:
      platform_map:
        linux: "linux"
        darwin: "darwin"
      arch_map:
        x86_64: "x64"
        aarch64: "arm64"
        arm64: "arm64"
      platform: "{{platform_result.stdout}}"
      arch: "{{arch_result.stdout}}"
      node_platform: "{{platform_map[platform]}}"
      node_arch: "{{arch_map[arch]}}"
      archive_name: "node-v{{node_version}}-{{node_platform}}-{{node_arch}}.tar.gz"

  - name: Download Node.js
    shell: "curl -fsSL {{node_base_url}}/v{{node_version}}/{{archive_name}} -o /tmp/node.tar.gz"
    creates: "/tmp/node.tar.gz"
    timeout: 10m
    retries: 3
    retry_delay: 30s

  - name: Extract Node.js
    unarchive:
      src: "/tmp/node.tar.gz"
      dest: "{{install_dir}}"
      strip_components: 1
      creates: "{{install_dir}}/bin/node"
      mode: "0755"
    become: true
    register: node_install

  - name: Create symlinks
    shell: |
      ln -sf {{install_dir}}/bin/node /usr/local/bin/node
      ln -sf {{install_dir}}/bin/npm /usr/local/bin/npm
      ln -sf {{install_dir}}/bin/npx /usr/local/bin/npx
    when: node_install.changed
    become: true

  - name: Verify installation
    shell: "node --version && npm --version"
    register: versions

  - name: Show installed versions
    shell: "echo 'Node.js installed: {{versions.stdout}}'"
See Also
- File Operations - File and directory management
- Loops - Iterating over multiple items
- Sudo - Privilege escalation
- Actions Reference - Complete action documentation
- Configuration Reference - Property reference