Saturday, 13 July 2019

Creating Tar File And GZipping Multiple Files - Java Program

GZIP can compress only a single file, so you can't GZIP multiple files directly. To GZIP multiple files you first have to archive them into a tar file and then compress that archive to create a .tar.gz file. In this post we'll see how to create a tar file in Java and gzip multiple files.
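
To see why a tar archive is needed first, here is a minimal sketch using only the JDK's java.util.zip classes. GZIPOutputStream takes one stream of bytes and produces one compressed stream; there is no notion of separate entries or file names to write, which is what the tar layer adds. The class name GzipSingleStreamDemo and its methods are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of the program in this post
public class GzipSingleStreamDemo {

    // Compresses one stream of bytes into one GZIP stream
    public static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gos = new GZIPOutputStream(bos)) {
            gos.write(data);
        }
        return bos.toByteArray();
    }

    // Restores the original bytes from a GZIP stream
    public static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gis.read(buffer)) != -1) {
                bos.write(buffer, 0, n);
            }
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "contents of a single file".getBytes(StandardCharsets.UTF_8);
        byte[] restored = gunzip(gzip(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

Running main prints the original text back, which shows the round trip of a single stream.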

Using Apache Commons Compress

Here I am posting a Java program to create a tar file using the Apache Commons Compress library. You can download it from here: https://commons.apache.org/proper/commons-compress/download_compress.cgi

Make sure to add commons-compress-xxx.jar to your application's classpath. I have used version commons-compress-1.13.

Steps to create tar files

Steps for creating tar files in Java are as follows:

  1. Create a FileOutputStream to the output (.tar.gz) file.
  2. Create a GZIPOutputStream which wraps the FileOutputStream object.
  3. Create a TarArchiveOutputStream which wraps the GZIPOutputStream object.
  4. Walk through all the files and directories in the source folder.
  5. For a directory, just add a TarArchiveEntry for it to the TarArchiveOutputStream.
  6. For a file, add a TarArchiveEntry for it and also write the content of the file to the TarArchiveOutputStream.

Folder Structure used

Here is the folder structure used in this post to read the files. Test, Test1 and Test2 are directories, and there are files within those directories. Your Java code should walk through the whole folder structure, create a tar file with entries for all the directories and files, and then compress it.

Test
  abc.txt
  Test1
     test.txt
     test1.txt
  Test2
     xyz.txt
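
To make the mapping from this folder structure to archive entry names concrete, here is a small sketch that uses only the JDK. It walks a folder and lists the names each directory and file would get relative to the folder's parent, with "/" as the separator and a trailing "/" on directories, which is how tar names directory entries. The class EntryNameLister and its method are hypothetical and only illustrate the naming; the sketch also assumes the source folder is not a file system root.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, only for illustrating entry names
public class EntryNameLister {

    // Returns sorted entry names relative to the parent of sourceDir,
    // e.g. "Test/Test1/test.txt" for a file and "Test/Test1/" for a directory
    public static List<String> entryNames(Path sourceDir) throws IOException {
        Path source = sourceDir.toAbsolutePath();
        // Assumption: source is not a root directory, so it has a parent
        Path base = source.getParent();
        try (Stream<Path> paths = Files.walk(source)) {
            return paths
                .map(p -> {
                    // Use "/" in entry names regardless of the operating system
                    String name = base.relativize(p).toString().replace(File.separatorChar, '/');
                    return Files.isDirectory(p) ? name + "/" : name;
                })
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        for (String name : entryNames(Paths.get(args[0]))) {
            System.out.println(name);
        }
    }
}
```

For the Test folder shown above, the list would be Test/, Test/Test1/, Test/Test1/test.txt, Test/Test1/test1.txt, Test/Test2/, Test/Test2/xyz.txt and Test/abc.txt.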

Java example code for creating tar

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
import org.apache.commons.compress.utils.IOUtils;

public class TarGZIPDemo {

 public static void main(String[] args) {
  String SOURCE_FOLDER = "/home/netjs/Documents/netjs/Test";
  TarGZIPDemo tGzipDemo = new TarGZIPDemo();
  tGzipDemo.createTarFile(SOURCE_FOLDER);

 }
 private void createTarFile(String sourceDir){
  File source = new File(sourceDir);
  // Using input name to create output name
  String outputFile = source.getAbsolutePath().concat(".tar.gz");
  // try-with-resources closes the streams in reverse order, even when
  // an exception is thrown, and avoids a NullPointerException in a
  // finally block if opening the output file fails
  try (FileOutputStream fos = new FileOutputStream(outputFile);
    GZIPOutputStream gos = new GZIPOutputStream(new BufferedOutputStream(fos));
    TarArchiveOutputStream tarOs = new TarArchiveOutputStream(gos)) {
   addFilesToTarGZ(sourceDir, "", tarOs);
  } catch (IOException e) {
   e.printStackTrace();
  }
 }
 
 public void addFilesToTarGZ(String filePath, String parent, TarArchiveOutputStream tarArchive) throws IOException {
  File file = new File(filePath);
  // Create entry name relative to parent file path 
  String entryName = parent + file.getName();
  // add tar ArchiveEntry
  tarArchive.putArchiveEntry(new TarArchiveEntry(file, entryName));
  if(file.isFile()){
   // Write file content to archive; try-with-resources closes
   // the input stream even if the copy fails
   try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file))) {
    IOUtils.copy(bis, tarArchive);
   }
   tarArchive.closeArchiveEntry();
  }else if(file.isDirectory()){
   // no need to copy any content since it is
   // a directory, just close the archive entry
   tarArchive.closeArchiveEntry();
   // listFiles() returns null if the directory can't be read
   File[] children = file.listFiles();
   if(children != null){
    for(File f : children){
     // recursively call the method for all the files and subdirectories;
     // entry names in a tar archive use "/" as the separator
     addFilesToTarGZ(f.getAbsolutePath(), entryName + "/", tarArchive);
    }
   }
  }
 }
}

This is how the created .tar.gz file looks when opened in an archive manager.

creating .tar.gz file in Java
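
A reader asked in the comments how to preserve file permissions. As far as I can tell, in commons-compress 1.13 the TarArchiveEntry(File, String) constructor assigns a default mode to the entry instead of reading the file's permissions, so one option is to set the mode yourself with TarArchiveEntry.setMode(int). The sketch below shows only the JDK part, converting a set of POSIX permissions into the numeric mode. PosixModeHelper and toMode are hypothetical names introduced for this example.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper, not part of Commons Compress or of the program above
public class PosixModeHelper {

    // Converts a POSIX permission set (e.g. rwxr-xr-x) to its numeric mode (e.g. 0755)
    public static int toMode(Set<PosixFilePermission> permissions) {
        int mode = 0;
        for (PosixFilePermission permission : permissions) {
            switch (permission) {
                case OWNER_READ:     mode |= 0400; break;
                case OWNER_WRITE:    mode |= 0200; break;
                case OWNER_EXECUTE:  mode |= 0100; break;
                case GROUP_READ:     mode |= 0040; break;
                case GROUP_WRITE:    mode |= 0020; break;
                case GROUP_EXECUTE:  mode |= 0010; break;
                case OTHERS_READ:    mode |= 0004; break;
                case OTHERS_WRITE:   mode |= 0002; break;
                case OTHERS_EXECUTE: mode |= 0001; break;
            }
        }
        return mode;
    }

    public static void main(String[] args) {
        int mode = toMode(PosixFilePermissions.fromString("rwxr-xr-x"));
        System.out.println(Integer.toOctalString(mode)); // prints 755
    }
}
```

In addFilesToTarGZ you could then keep the TarArchiveEntry in a variable, call entry.setMode(PosixModeHelper.toMode(Files.getPosixFilePermissions(file.toPath()))) and pass the entry to putArchiveEntry. Note that Files.getPosixFilePermissions throws UnsupportedOperationException on file systems that don't support POSIX permissions, so this approach is meant for Linux or macOS style file systems.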

That's all for this topic, Creating Tar File And GZipping Multiple Files - Java Program. If you have any doubts or suggestions, please drop a comment. Thanks!

3 comments:

  1. How to preserve file permissions?

  2. With this code the folder structure is lost.
    Say the folder structure is
    fol1\
      fol2\
        file2.txt
      fol3\
        file.txt

    Inside the fol1.tar file we don't have the folder structure; instead we have files like

    fol1\fol2\file2.txt
    fol1\fol3\file1.txt

    The folder structure is lost, please help.

    Replies
    1. That shouldn't be happening. Anyway, I have changed the code a bit; you can try the updated code and let me know.
