\documentclass{article}
\usepackage{noweb}%
\usepackage{times}%
\input nowebmargins%
\pagestyle{noweb}%
\begin{document}
\title{Two Functions for Migrating \textsl{S} Objects}
\author{B. Narasimhan\\
  Department of Statistics\\
  Stanford University\\
  Stanford, CA 94305}
\date{Version of \today}
\maketitle
\tableofcontents

\section{Introduction}
\label{sec:intro}

One of the chores of migrating from one platform to another is that
\textsl{S} objects need to be migrated too. We provide two functions,
\texttt{Dump.All} and \texttt{Undump.All}, to aid in this process.

These functions are meant to be used as follows. Suppose user Joe
(userid \texttt{joe}) wishes to move objects from platform $A$ to
platform $B$. As a first step, Joe transfers \emph{all} his files
from $A$ to $B$ using some program like \texttt{tar}, preserving his
directory structure. On platform $A$, Joe invokes \textsl{S}
\emph{from his home directory} and executes the command
\begin{verbatim}
> Dump.All("/var/tmp")
\end{verbatim}
The argument to the function is a directory that is assumed to have
enough space to hold \emph{all} the dumped files. If everything goes
well, Joe will have a gzip'ed tar ball called \texttt{joe.tar.gz} in
the directory \texttt{/var/tmp}. Joe can now \texttt{ftp} the file
over to platform $B$ in \texttt{binary} mode, say, to a directory
\texttt{/tmp}, invoke \textsl{S} from his home directory there, and type
\begin{verbatim}
> Undump.All("/tmp")
\end{verbatim}
to restore all his objects.

The functions rely on some assumptions.
\begin{itemize}
\item As remarked before, on platform $A$, the directory must be able
  to hold \emph{all} the dumped files. \emph{Actually, the directory
  must have enough space to hold roughly twice that amount}, since the
  tar archive is built alongside the dump files before they are
  removed. I've wondered whether this assumption should be relaxed and
  some flexibility introduced, but that seems to defeat my main
  intent, a single command doing everything; still, I may be persuaded.
\item The user's shell is C-Shell compatible.
\item The user has the same userid on both machines.
\item The complete directory structure of $A$ has already been
  recreated on $B$.
\item The \texttt{find}, \texttt{tar}, and \texttt{gzip} commands are
  available on both platforms. Note that the \texttt{find} command
  misbehaves on some machines. I recommend the GNU tools, which are
  beyond reproach.
\end{itemize}

\section{Copyright}
\label{sec:copyright}

We begin with our usual copyright.
<>=
#
# $Revision: 1.1 $
#
# Copyright (C) 1996, B. Narasimhan (naras@stat.stanford.edu)
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
#
@

\section{The functions}
\label{sec:functions}

The two functions are, of course, \texttt{Dump.All} and
\texttt{Undump.All}.
<<*>>=
<>
<>
<>
@

\subsection{The \texttt{Dump.All} function}
\label{sec:dumpall-func}

The function is quite easy to describe.
<>=
Dump.All <- function(dest.dir="~", find.cmd="find") {
  <>
  data.dirs <- <>
  for (i in seq(along = data.dirs)) {
    <>
    <>
  }
  <>
  <>
  <>
}
@ %def Dump.All dest.dir data.dirs
@
I will be using \texttt{all.dump.names} to build up a string of all
the dump filenames to be fed to the \texttt{tar} program. The
correspondence between the dump files and the directories will be
stored in the map file.
<>=
whoami <- unix("whoami")
all.dump.names <- ""
dump.map.name <- paste(whoami, "map", sep=".")
dump.map.pathname <- paste(dest.dir, dump.map.name, sep="/")
@ %def whoami all.dump.names dump.map.name dump.map.pathname
@
The list of data directories is best found by the \texttt{find}
program.
<>=
unix(paste(find.cmd, ". -name '.Data*' -type d -follow -print"))
@
On to dumping \textsl{S} objects. Let us tell the user what is going
on.
<>=
print(paste("Processing", data.dirs[i]))
@
Since many people could be doing this at the same time, we want to
keep the dump file names unique. So we'll use \texttt{joe.Dump.1},
\texttt{joe.Dump.2} etc., for the dump file names. Also we dump the
objects to the file in the specified destination directory.
<>=
dump.name <- paste(whoami, "Dump", i, sep=".")
attach(data.dirs[i])
data.dump(objects(2), file = paste(dest.dir, dump.name, sep="/"))
detach(data.dirs[i])
@ %def dump.name
@
After a directory is done, we inform the user and update our string
of dump names. We also add an entry to the map file.
<>=
print(paste("Wrote", paste(dest.dir, dump.name, sep="/")))
all.dump.names <- paste(all.dump.names, dump.name)
@
The map file needs to be updated to reflect the correspondence
between directories and dumps.
<>=
unix(paste("echo", dump.name, data.dirs[i], ">>", dump.map.pathname))
@
Now we are ready to make our tar ball. With GNU tar, the following
could be done in one step since GNU tar can compress and decompress.
We'll do it the traditional way.
<>=
tar.name <- paste(whoami, "tar", sep=".")
unix(paste("cd", dest.dir, "; tar -cvf", tar.name, all.dump.names,
           dump.map.name))
unix(paste("cd", dest.dir, "; gzip", tar.name))
@ %def tar.name
@
We can now remove all the dump files and the map file.
<>=
unix(paste("cd", dest.dir, "; rm", all.dump.names, dump.map.name))
@
The cautionary message.
<>=
print("Warning: Before relying completely on the dumps")
print(" make sure that ")
print(" 1) no errors occurred AT ALL")
print(" 2) you test the objects on the other system")
print(" before deleting the .Data directories on the current system.")
print("Dumping finished. You can now ftp the file ")
print(paste(" ", tar.name, ".gz", sep=""))
print("to the other machine. Be sure to use a binary transfer.")
return("")
@

\subsection{The \texttt{Undump.All} function}
\label{sec:undumpall-func}

We just have to undo what we did. This is far easier.
<>=
Undump.All <- function(src.dir) {
  <>
  <>
  <>
  for (i in seq(along = map$dump.names)) {
    <>
    <>
    <>
    <>
    print(paste("Restored objects in", map$dump.dirnames[i]))
  }
  return("Undump.All finished.")
}
@ %def Undump.All src.dir
@
To explode the tar ball, we run a unix command.
<>=
whoami <- unix("whoami")
tar.gz.name <- paste(whoami, "tar.gz", sep=".")
unix(paste("cd", src.dir, "; gunzip -c", tar.gz.name, "| tar -xf -"))
@ %def tar.gz.name
@
The map name is \texttt{userid.map}.
<>=
map.name <- paste(whoami, "map", sep=".")
@ %def map.name
@
The map file contains two columns: a dump file name and the directory
of which it is the dump.
<>=
map <- scan(paste(src.dir, map.name, sep="/"),
            what = list(dump.names="", dump.dirnames=""))
@ %def map
@
Since we have assumed the user transferred all his files first, the
\texttt{.Data} directories, though unusable, will exist. Time to get
rid of objects in that directory.
<>=
unix(paste("cd", map$dump.dirnames[i], "; rm -rf *"))
@
The rest of the code is self-explanatory.
<>=
attach(map$dump.dirnames[i])
detach(".Data")
@
<>=
data.restore(paste(src.dir, map$dump.names[i], sep="/"))
@
<>=
attach(".Data")
detach(map$dump.dirnames[i])
@

\section{Indices}
\label{sec:index}

\subsection{Code Chunks}
\label{sec:code-chunks}

This index is generated automatically. The numeral is that of the
first definition of the chunk.
\nowebchunks \subsection{Index of Identifiers} \label{sec:identifiers} Here is a list of the identifiers used, and where they appear. Underlined entries indicate the place of definition. This index is generated automatically. \nowebindex \end{document}