FlexTree

FlexTree is a method for doing “classification” based on a “learning” (alternatively “training”) sample of data. By classification is meant prediction of an unknown outcome “Y” that belongs to one of two, perhaps more, categories on the basis of “features” X. A learning, or training, sample of observations X coupled with known values of Y is given. The prediction for a new observation, given its features but not its outcome, is based on the learning sample.

The predictor has a rooted, binary tree-structured form and resembles that of CART, though it differs greatly in how it selects and combines features. Features can be qualitative or quantitative.

Motivation comes from studies of the genetics of complex disease. There is a need to understand complicated gene-gene, gene-environment, and other possible interactions in a way that enables ready interpretation and accurate classification. The challenge is greatest when there are many candidate features and accurate prediction is possible, at least in principle, but the “signal” is carried by only a few of them. Each feature that bears upon the signal carries a score. Scores add. The “reward” for a sufficiently high total score is the outcome, typically an untoward one.
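The additive-score idea can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the FlexTree package's interface; the feature names, score values, threshold, and the function names `additive_score` and `classify` are all hypothetical.

```python
# Illustrative only: feature names, scores, and the threshold are hypothetical
# and are not taken from FlexTree.
from typing import Dict


def additive_score(features: Dict[str, bool], scores: Dict[str, float]) -> float:
    """Sum the scores of the signal-bearing features that are present."""
    return sum(s for name, s in scores.items() if features.get(name, False))


def classify(features: Dict[str, bool], scores: Dict[str, float], threshold: float) -> bool:
    """Predict the outcome when the total score reaches the threshold."""
    return additive_score(features, scores) >= threshold


# Only a few features carry the signal; any other feature contributes nothing.
scores = {"variant_a": 1.0, "variant_b": 1.0, "smoker": 2.0}
threshold = 3.0

subject_1 = {"variant_a": True, "variant_b": False, "smoker": True, "noise_feature": True}
subject_2 = {"variant_a": True, "variant_b": True, "smoker": False}

print(classify(subject_1, scores, threshold))  # True: 1.0 + 2.0 reaches 3.0
print(classify(subject_2, scores, threshold))  # False: 1.0 + 1.0 falls short
```

In this toy setting a feature with no score (here `noise_feature`) does not affect the prediction, which mirrors the situation described above where only a few of many candidate features carry the signal.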

FlexTree is an R package. With current algorithms and software, computation could be an issue. On a typical modern desktop computer, with a learning sample of 2,000 subjects and about 50 features, FlexTree should run in about 10 hours. The entire procedure involves growing the tree; suitable validation; and, as with CART, pruning a large tree to a smaller one intended for subsequent use. With a much larger learning sample or many more features, computer time might be prohibitive, in which case users should consider other approaches or pre-filter the data to reduce the time involved to a manageable level.
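One possible form of pre-filtering is a simple univariate screen applied before the data are passed to FlexTree. The sketch below is an assumption about how such a screen might look, not a procedure recommended by the FlexTree authors. It is written in Python for illustration, the function name `prefilter` is hypothetical, and it assumes numeric features on comparable scales with both outcome classes present.

```python
# Hypothetical pre-filtering sketch: rank features by the absolute difference
# in their mean between the two outcome classes and keep the top k.
from statistics import mean
from typing import Dict, List


def prefilter(rows: List[Dict[str, float]], outcomes: List[int], k: int) -> List[str]:
    """Return the k feature names with the largest gap between class means."""
    names = list(rows[0].keys())

    def gap(name: str) -> float:
        cases = [r[name] for r, y in zip(rows, outcomes) if y == 1]
        controls = [r[name] for r, y in zip(rows, outcomes) if y == 0]
        return abs(mean(cases) - mean(controls))

    # sorted() is stable, so ties keep their original column order.
    return sorted(names, key=gap, reverse=True)[:k]


rows = [
    {"g1": 1, "g2": 0, "g3": 1},
    {"g1": 1, "g2": 1, "g3": 0},
    {"g1": 0, "g2": 0, "g3": 1},
    {"g1": 0, "g2": 1, "g3": 0},
]
outcomes = [1, 1, 0, 0]

print(prefilter(rows, outcomes, k=1))  # ['g1']
```

A marginal screen of this kind can discard features that matter only through interactions with other features, so it should be applied with care when interactions are the object of study.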

STANFORD ACADEMIC FLEXTREE LICENSE AGREEMENT

  1. This is a legal agreement between you, RECIPIENT, and STANFORD UNIVERSITY. By accepting, receiving, and using FlexTree, including any accompanying information, materials or manuals, you are agreeing to be bound by the terms of this Agreement. If you do not agree to the terms of this Agreement, please contact (flextreesupport@gmail.com) with your concerns.

  2. STANFORD grants to RECIPIENT a royalty-free, nonexclusive, and nontransferable license to use the Program furnished hereunder, upon the terms and conditions set out below.

  3. RECIPIENT acknowledges that the Program is a research tool still in the development stage and that it is being supplied as is, without any accompanying services, support, or improvements from STANFORD.

  4. RECIPIENT agrees to use the Program solely for internal non-commercial purposes and shall not distribute or transfer it to another location or to any other person without prior written permission from STANFORD.

  5. RECIPIENT agrees not to reverse engineer, reverse assemble, reverse compile, decompile, disassemble, or otherwise attempt to create the source code for the Program.

  6. If permission to transfer the Program is given under Article 4 above, RECIPIENT warrants that RECIPIENT will not remove or export any part of the software or Program from the United States except in full compliance with all United States and other applicable laws and regulations.

  7. RECIPIENT will indemnify, hold harmless, and defend STANFORD against any claim of any kind arising out of or related to the exercise of any rights granted under this Agreement or the breach of this Agreement by RECIPIENT.

  8. Title and copyright to the Program and any associated documentation shall at all times remain with STANFORD, and RECIPIENT agrees to preserve same.

If you AGREE, please write to flextreesupport@gmail.com to:
  1. Confirm your agreement.

  2. Provide information about yourself, including name, organization, title, contact email address, and the purpose of using this software.