Design Simulations of Beamforming Wireless Nets


  • #125556

    Josh Stern
    Moderator

    Simulations should be able to consider relevant environmental conditions along with different protocols for adapting the beam parameters. Q-learners can be considered as part of the control strategy that performs the adaptive optimization.
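
    As a rough illustration, here is a minimal sketch of what the Q-learning piece of such a simulation could look like. The state buckets, candidate beam offsets, reward source & constants below are placeholder assumptions, not a spec:

    import random
    from collections import defaultdict

    # Placeholder discretization: states are coarse SNR buckets,
    # actions are candidate beam steering offsets (degrees).
    STATES = range(4)                      # e.g. SNR buckets from poor to excellent
    ACTIONS = [-30, -10, 0, 10, 30]        # candidate beam offsets

    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration

    q = defaultdict(float)                 # Q[(state, action)] -> value estimate

    def choose_action(state):
        """Epsilon-greedy choice of beam offset for the current channel state."""
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: q[(state, a)])

    def update(state, action, reward, next_state):
        """One-step Q-learning (temporal-difference) update."""
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])

    # Inside the simulation loop, the environment model supplies next_state
    # (the new SNR bucket) and reward (e.g. measured throughput) after the
    # chosen beam offset is applied.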

  • #125567

    Josh Stern
    Moderator

    This old thread about the difference between Back Propagation & Q Learning is interesting.

    The most basic point, not expressed concisely there, is that BP was formulated for an ML paradigm of learning a function X => Y from many supervised examples of that multivariate mapping. The error signal for the mapping is the prediction error, available at each (input, target) training pair.

    Q Learning, by contrast, is meant for temporal models that receive positive utility or negative feedback error online, at ongoing steps of discrete time.
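
    To make the contrast concrete, here is a toy sketch of the two error signals side by side (the linear model, dict-based Q table & constants are only illustrative):

    # Back-prop / supervised style: the error is the prediction error on a
    # labelled (x, y) pair, available immediately for every training example.
    def supervised_step(w, x, y, lr=0.01):
        pred = w * x                        # toy linear model
        grad = (pred - y) * x               # gradient of the squared error
        return w - lr * grad

    # Q Learning style: there is no labelled target; the "error" is the
    # temporal difference between the current value estimate and the
    # reward plus the bootstrapped estimate of the next state's value.
    def td_step(q, s, a, r, s_next, actions, lr=0.1, gamma=0.9):
        target = r + gamma * max(q.get((s_next, a2), 0.0) for a2 in actions)
        old = q.get((s, a), 0.0)
        q[(s, a)] = old + lr * (target - old)
        return q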

    Q Learning doesn’t tell us how to pick meta-strategies for bandit-style exploration, or how to create meta-categories we can notice during learning. We can fix a network, or edit/add to the network over time. Theory might be available for some ways of proceeding & not others. But conceptually we are trying to set up adaptive optimization of model-choice parameters during ongoing experience, using what becomes recency-weighted past experience as a guide.

    Q Learning, in that way, is like taking the reinforcement learning of static ML into more environmentally realistic, ongoing situations. Keep that concept in mind.
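
    For the recency-weighted part, one simple placeholder is an exponential moving average of observed payoff per action, so old observations decay geometrically rather than being dropped outright:

    estimates = {}

    def recency_weighted_update(action, payoff, decay=0.1):
        """Exponential moving average of payoff per action: recent
        experience dominates, old experience decays geometrically."""
        prev = estimates.get(action, payoff)
        estimates[action] = (1 - decay) * prev + decay * payoff
        return estimates[action]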

    • #125578

      Josh Stern
      Moderator

      Side note: there are a huge number of research questions in Q Learning that are still without a mathematical modeling framework.

      Consider these points:

      a) Coming up with new, improved models, which are typically more complex than the previous model, depends on the set of data available for analysis. If the costs of saving past data & re-analyzing it are included, what is the computational & pragmatic complexity theory of when re-analysis should be undertaken to see whether a new model of operations is appropriate?

      Note that, heuristically, to apply the past data the new model needs to artificially restrict its outcome space of policy actions to those that were actually taken, because we don’t know the risks of the paths that were not taken in the past. So that is part of the complexity & part of the motive for looking at small, safe trials of new action routes.
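
      A crude sketch of that restriction, assuming the logged data is just a list of (state, action, payoff) tuples (the names are hypothetical):

      def evaluate_on_logged_actions(candidate_policy, logged_data):
          """Score a new policy only on the (state, action) pairs that were
          actually taken in the past; transitions the old policy never tried
          carry no payoff information here, so they are skipped."""
          total, matched = 0.0, 0
          for state, action, payoff in logged_data:
              if candidate_policy(state) == action:
                  total += payoff
                  matched += 1
          # Coverage matters: if the new policy rarely agrees with the logged
          # actions, the estimate rests on very few samples.
          return (total / matched if matched else None), matched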

      b) We would also like to see some claims about which models of the environment seem most apt in actual trials of interest. For example, mixtures of Gaussian payoffs, with or without some chance of “death” (infinite penalty) & some kind of drift over time? Is that a fit to some situations of interest? How do we model the situation where new tools/routes become available & change both the environment & the set of choices? What is the form of the discontinuity? A new mixture?
      Mixture distributions are painful because they have no fixed-dimensional sufficient statistics. Can we find good approximately sufficient statistics for situations of interest?
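
      One way to write down that kind of environment, purely as a strawman for simulation (the mixture weights, drift rate & death probability below are arbitrary):

      import random

      def draw_payoff(arm, t,
                      means=(1.0, 2.0), weights=(0.7, 0.3),
                      sigma=0.5, drift=0.001, p_death=1e-4):
          """Payoff from a two-component Gaussian mixture whose means drift
          linearly in time, with a small per-step chance of "death"
          (modeled here as an unbounded penalty that ends the episode)."""
          if random.random() < p_death:
              return float("-inf")          # "death": infinite penalty
          component = 0 if random.random() < weights[0] else 1
          mean = means[component] + drift * t + 0.1 * arm   # arm-dependent offset (arbitrary)
          return random.gauss(mean, sigma)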

      • #125579

        Josh Stern
        Moderator

        Sufficient statistics are defined to be finite (fixed-dimensional). So what we would mean by adequate approximate stats would be based on approximation-theory ideas: pruning an epsilon cover of the event space, including payoffs, down to something good enough for the measurements we care about.

        A set of stats that grows like log N, where N is the number of data events, might also be interesting, though that moves away from the fixed-size idea.
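
        As a toy version of that idea: bin observed payoffs on an epsilon grid & keep only the counts, so the summary size is bounded by the cover of the payoff range rather than by N (exponentially spaced bins would instead grow roughly like log N):

        from collections import Counter

        class EpsilonCoverSummary:
            """Counts of payoffs snapped to an epsilon grid: an approximate
            summary whose size is bounded by the cover of the payoff range,
            not by the number of observations."""
            def __init__(self, eps=0.1):
                self.eps = eps
                self.counts = Counter()

            def add(self, payoff):
                self.counts[round(payoff / self.eps)] += 1

            def mean(self):
                n = sum(self.counts.values())
                if n == 0:
                    return None
                return sum(k * self.eps * c for k, c in self.counts.items()) / n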
